Genetic Variants Increase the Risk of Age-Related Macular Degeneration



[01] This invention was made using funds from U.S. government grant nos. U10EY012118
and EY015216 from the National Institutes of Health (NIH)/National Eye Institute and
by grant AG O 268 from the NIH/National Institute on Aging and by KR 00095 from
the National Institutes of Health GCRC. Therefore the U.S. government retains
certain rights in the invention.

TECHNICAL FIELD OF THE INVENTION

[02] This invention is related to the areas of genetic testing, drug discovery, and
Age-Related Macular Degeneration. In particular, it relates to genetic variants which
increase the risk of Age-Related Macular Degeneration, particularly in combination
with certain behavior.

BACKGROUND OF THE INVENTION 

[03] Age-related macular degeneration (AMD) causes progressive impairment of central
vision and is the leading cause of irreversible vision loss in older Americans (1). The
most severe form of AMD involves neovascular/exudative (wet) and/or atrophic (dry)
changes to the macula. Although the etiology of AMD remains largely unknown,
implicated risk factors include age, ethnicity, smoking, hypertension, obesity and diet
(2). Familial aggregation (3), twin studies (4), and segregation analysis (5) suggest
that there is also a significant genetic contribution to the disease. The candidate gene
approach, which focuses on testing biologically relevant candidates, has implicated
variants in the ABCA4, FBLN6, and APOE genes as risk factors for AMD.
Replication of the ABCA4 and FBLN6 findings has been difficult, and in toto these
variants explain only a small proportion of AMD (6-8). An alternative genomic
approach uses a combination of genetic linkage and association to identify novel
genes involved in AMD. We participated in a recent collaborative genome-wide
linkage screen (9) in which chromosome 1q32 was identified as a likely region for an
AMD risk gene, a location also supported by other studies (10, 11). This region
contains over 100 genes (see Online Mendelian Inheritance in Man at the
NCBI website), and no particular gene was identified by this work.

[04] Age-related macular degeneration (AMD) is a common complex disorder that affects
the central region of the retina (macula) and is the leading cause of legal blindness in
older American adults. The prevalence of AMD and its significant morbidity will rise
sharply as the population ages. AMD is a clinically heterogeneous disorder with a
poorly understood etiology. Population-based longitudinal studies (Klaver et al. 2001;
van Leeuwen et al. 2003; Klein et al. 2003) have established that the presence of
extracellular protein/lipid deposits (drusen) between the basal lamina of the retinal
pigment epithelium (RPE) and the inner layer of Bruch's membrane is associated with
an increased risk of progressing to an advanced form of AMD, either geographic
atrophy or exudative disease. The presence of large and indistinct (soft) drusen
coupled with RPE abnormalities is considered an early form of the disorder and is
often referred to as age-related maculopathy (ARM).

[05] Epidemiologically, AMD is a complex disorder with contributions of environmental
factors as well as genetic susceptibility (Klein et al. 2004). Many environmental and
lifestyle factors have been postulated, but by far the most consistently implicated
non-genetic risk factor for AMD is cigarette smoking (Smith et al. 2001). Much progress
has recently been made in identifying and characterizing the genetic basis of AMD. In
a remarkable example of the convergence of methods for disease gene discovery,
multiple independent research efforts identified the Y402H variant in the complement
factor H (CFH [MIM 134370]) gene on chromosome 1q32 as the first major AMD
susceptibility allele (Haines et al. 2005; Hageman et al. 2005; Klein et al. 2005;
Edwards et al. 2005; Zareparsi et al. 2005; Conley et al. 2005). While one of the
studies was able to pinpoint CFH on the basis of a whole-genome association study
(Klein et al. 2005), most studies focused on the 1q32 region because it had
consistently been implicated by several whole-genome linkage scans. A second
genomic region with similarly consistent linkage evidence is chromosome 10q26,
which was identified as the single most promising region by a recent meta-analysis of
published linkage screens (Fisher et al. 2005).

[06] Two recent studies have suggested specific AMD susceptibility genes located on
chromosome 10q26. One used a combination of family-based and case-control
analyses to implicate the PLEKHA1 gene (pleckstrin homology domain containing,
family A (phosphoinositide binding specific) member 1 [MIM 607772]) and the
predicted LOC387715 gene (Jakobsdottir et al. 2005). However, the association
signals for single-nucleotide polymorphisms (SNPs) in these two genes were
statistically indistinguishable. A second study using two independent case-control
datasets concluded that the T allele of SNP rs10490924 in LOC387715, a coding
change (Ala69Ser) in exon 1 of this poorly characterized gene, was the most likely
AMD susceptibility allele (Rivera et al. 2005). Both studies reported that the
chromosome 10q26 variant confers an AMD risk similar in magnitude to that of the
Y402H variant in CFH. Here, we describe highly significant association of SNPs in
LOC387715 with AMD. In our data, only SNPs in this gene, including rs10490924,
explain the strong linkage and association signal in this region. Given a previous
report of an effect of cigarette smoking on the linkage evidence in the 10q26 region
(Weeks et al. 2004; 9), we tested whether smoking modified this association.

[07] There is a continuing need in the art to identify individual genes that are involved in
the pathogenesis of AMD and/or to identify particular alleles that are involved in the
pathogenesis of AMD, as well as to identify the interaction of the genes with
modifiable behaviors.

SUMMARY OF THE INVENTION 

[08] According to one embodiment of the invention a method is provided for assessing
increased risk of Age Related Macular Degeneration. The identity is determined of at
least one nucleotide residue of Complement Factor H coding sequence of a person.
The nucleotide residue is identified as normal or variant by comparing it to a normal
sequence of Complement Factor H coding sequence as shown in SEQ ID NO: 1. A
person with a variant sequence has a higher risk of Age Related Macular Degeneration
than a person with a normal sequence.

[09] According to another embodiment a method is provided for assessing increased risk of
Age Related Macular Degeneration. The identity is determined of at least one amino
acid residue of Complement Factor H protein of a person. The residue is identified as
normal or variant by comparing it to a normal sequence of Complement Factor H as
shown in SEQ ID NO: 2. A person with a variant sequence has a higher risk of Age
Related Macular Degeneration than a person with a normal sequence.

[10] Another embodiment of the invention provides a method for screening for a potential
drug for treating Age Related Macular Degeneration. A Complement Factor H
protein is contacted with a test agent in the presence of a polyanion. Binding of the
polyanion to Complement Factor H is measured. A test agent is identified as a
potential drug for treating Age Related Macular Degeneration if it increases binding of
Complement Factor H to the polyanion.

[11] Another embodiment of the invention is a method for screening for a potential drug
for treating Age Related Macular Degeneration. A Complement Factor H protein is
contacted with a test agent in the presence of C-Reactive Protein. C-Reactive Protein
binding to Complement Factor H is measured. A test agent is identified as a potential
drug for treating Age Related Macular Degeneration if it increases binding of
Complement Factor H to C-Reactive Protein.

[12] A further embodiment of the invention is a method to assess risk of AMD in a patient.
The presence of a T allele at rs10490924 is determined in a patient. Whether the
patient is a cigarette smoker is determined. The patient is identified as being at high
risk of AMD if the patient has the T allele and is a cigarette smoker. The patient is
identified as being at lower risk of AMD if the patient has the T allele but is not a
cigarette smoker or is a cigarette smoker but does not have the T allele. The patient is
identified as being at lowest risk if the patient does not have the T allele and is not a
cigarette smoker.
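
The three-tier classification of this embodiment can be sketched as a small decision function. This is only an illustrative sketch: the function names `has_t_allele` and `amd_risk_tier`, the two-letter genotype string, and the tier labels are assumptions introduced here and are not part of the disclosure.

```python
def has_t_allele(genotype: str) -> bool:
    """True if a two-letter rs10490924 genotype such as 'GT' carries a T allele.

    The genotype string format is an assumption made for this sketch.
    """
    return "T" in genotype.upper()


def amd_risk_tier(carries_t: bool, is_smoker: bool) -> str:
    """Map T-allele status and smoking status to the qualitative tier of paragraph [12]."""
    if carries_t and is_smoker:
        return "high"      # T allele and cigarette smoker
    if carries_t or is_smoker:
        return "lower"     # exactly one of the two factors is present
    return "lowest"        # neither factor is present


if __name__ == "__main__":
    for genotype, smoker in [("TT", True), ("GT", False), ("GG", True), ("GG", False)]:
        print(genotype, smoker, amd_risk_tier(has_t_allele(genotype), smoker))
```

The tiers are qualitative only; the sketch does not attach numerical risks to them.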

[13] Yet another embodiment of the invention is a method to assess risk and treat AMD in
a patient. The presence of a T allele at rs10490924 is determined in a patient.
Whether the patient is a cigarette smoker is determined. If the patient has the T allele
at rs10490924 and is a cigarette smoker, behavioral therapy is provided to the patient
to encourage smoking cessation.

[14] Still another embodiment of the invention is a method to assess risk and treat AMD in
a patient. The presence of a T allele at rs10490924 is determined in a patient.
Whether the patient is a cigarette smoker is determined. If the patient has the T allele
at rs10490924 and is a cigarette smoker, the patient is provided with smokeless
nicotine to encourage smoking cessation.

BRIEF DESCRIPTION OF THE DRAWINGS 

[15] Fig. 1. Haploview plot defining haplotype block structure of AMD associated region.
The relative physical position of each SNP is given in the upper diagram, and the
pairwise linkage disequilibrium (D') between all SNPs is given below each SNP
combination. Dark red shaded squares indicate D' values >0.80. D'=1.0 when no
number is given.



[16] Fig. 2. Plot of family-based and case-control P values for all SNPs within the
AMD-associated haplotype. The genomic region spanning each gene is indicated in green.
-log10 of the nominal P values are plotted for each SNP. Results for both the
family-based and case-control data sets converge within the CFH gene.

[17] Fig. 3. Results of linkage (left axis: two-point and multipoint lod scores) and
association analysis (right axis: log10-transformed p-values from logistic regression of
case-control dataset, using additive coding described in text and adjusted for age and
sex). For exact p-values in 122-127 Mb region that are smaller than 10*\, see Table 5.

[18] Fig. 4. LD pattern in region from PLEKHA1 [MIM 607772] to CUZD1 [HGNC
17937]. The relative physical position of each SNP is given in the upper diagram, and
the pairwise D' between all SNPs is given below each SNP combination. Red-shaded
squares indicate D' values >0.80. D'=1.0 when no number is given, which is either
significant (dark-red shading) or non-significant (blue shading) based on the
Haploview default definition (Gabriel et al. 2002).

[19] Fig. 5A. Genotype frequencies at rs10490924 in unrelated AMD patients, by
pack-years of cigarette smoking. Fig. 5B. Genotype frequencies at rs10490924 in unrelated
controls without AMD, by pack-years of cigarette smoking.

[20] Fig. 6. Ordered subset analysis of 90 multiplex AMD families with information on
pack-years of cigarette smoking. Dashed line: Multipoint LOD* in 90 families. Solid
line: Multipoint LOD* in 40 families with ≥44 pack-years, averaged across family
members affected with AMD.

[21] Fig. 7: Table 4. Demographic and clinical characteristics of study population.

[22] Fig. 8: Table 5. SNPs in 122-127 Mb region with p≤0.005 in case-control association
analysis. MAF: minor allele frequency. Odds ratios (OR) adjusted for age and sex,
estimated separately for heterozygous (het) and homozygous (hom) carriers of minor
allele. P-value from additive coding of SNP covariate described in text. GIST:
Genotype-IBD sharing test (Li et al. 2004).

[23] Fig. 9: Table 6. Two-locus genotype frequencies (%) and odds ratios for rs10490924
in LOC387715 and Y402H in CFH. All odds ratios adjusted for age and sex.

[24] Fig. 10: Table 7. Results of fitting two-factor models by logistic regression, adjusted
for age and sex. Factor 1 is rs10490924, model definitions in text. Akaike's
information criterion (AIC) difference is difference of the AIC from the best-fitting
model.

[25] Fig. 11: Table 8. Joint frequencies (%) and odds ratios for rs10490924 in LOC387715
and smoking history (ever vs. never). All odds ratios adjusted for age and sex.

[26] Fig. 12: Table 9. Minor allele frequency (MAF) and genotype distribution (number of
individuals) at rs10490924 by AMD grade. Data for smokers and non-smokers
obtained from dataset used for logistic regression modeling (Table 8). Data for all
genotyped individuals estimated by combining family-based and case-control datasets,
including related individuals.

[27] Fig. 13: Supplemental Table 1. SNPs identified in LOC387715 sequencing of
individuals homozygous for rs10490924 variant.

[28] Fig. 14: Supplemental Table 2. SNPs identified in CUZD1 sequencing of individuals
homozygous for rs1891110 variant.

[29] Fig. 15: Supplemental Table 3. Case-control association results for all SNPs in
112-132 Mb region.

DETAILED DESCRIPTION OF THE INVENTION 

[30] The inventors have developed methods for assessing risk of Age-Related
Macular Degeneration (AMD) in affected families and in individuals not known to be
in affected families. Although developing the disease is a multi-factorial process,
presence of a polymorphism in the CFH gene (or complement factor H protein)
indicates a greatly increased risk (approximately double). Interestingly, one
polymorphism is so prevalent in the Caucasian population that 1/3 of individuals carry
at least one copy of that form. Moreover, identification of the CFH gene as involved
in AMD pathogenesis permits the use of the CFH protein in drug screening assays. In
addition, we have identified a coding change (Ala69Ser) in the LOC387715 gene as a
second major susceptibility allele for AMD. The overall effect of the gene on risk is
driven by a statistical interaction between the LOC387715 variant
and cigarette smoking.

[31] The Y402H polymorphism (encoded by the T1277C polymorphism) is located in the
domain known as SCR7. See Table 3. SCR7 is known to contain binding sites for
both C-Reactive Protein (CRP) and polyanions, such as heparin and sialic acid. The
location of this highly informative polymorphism suggests that not only is the CFH
protein involved in the pathogenesis of AMD, but that the ability to bind one or both
of C-Reactive Protein and polyanions is also involved. Variations in other domains of
CFH may also relate to pathogenesis of AMD, including variations in domains that
are involved in binding of complement factor C3b. Such variations may have an
effect alone or in conjunction with the Y402H variant.

[32] Any change in the CFH gene or encoded protein can be determined by comparing to
the sequences of the major allele in the Caucasian population as shown in SEQ ID
NO: 1 and 3, for nucleotide and protein, respectively. Methods of detecting sequence
differences between a test subject's CFH and the major allele or major protein can be
any method known in the art. These include side-by-side comparisons of
physico-chemical properties of proteins, immunological assays, primer extension methods,
hybridization methods, nucleotide sequencing, amino acid sequencing,
amplification, PCR, oligonucleotide mismatch ligation assays, primer extension
assays, heteroduplex analysis, allele-specific amplification, allele-specific primer
extension, SSCP, DGGE, TGGE, mass spectroscopy, high pressure liquid
chromatography, and combinations of these techniques.

[33] Binding assays between Complement Factor H and either polyanions or C-Reactive
Protein (CRP) can be performed using any format known in the art. Binding can be
measured in solution or on a solid support. One of the partners may, for example, be
labeled with a radiolabel or fluorescent label. Partners can be identified using first
antibodies which are either themselves labeled or measured using second antibodies
which are labeled and reactive with the first antibodies. Assay formats can be
competitive or non-competitive.

[34] Test agents can be natural products or synthetic, purified or mixtures. They can be the
products of combinatorial chemistry or individual products or families of products
which are selected on the basis of structural information. Test agents are identified as
candidates for treating AMD if they increase the binding of complement factor H to
any of its binding partners, including but not limited to C3b, sialic acid,
heparin, and CRP.

[35] The T allele is the variant of rs10490924 that has a T at nucleotide 26 as shown in
SEQ ID NO: 9. Other variant alleles as shown in SEQ ID NO: 7-56 can be detected
and used to assess risk of AMD. The other variants may be used independently or
may be used in conjunction with an assessment of smoker status. Current smokers are
individuals who smoke at least once per week. However, historical smoking in an
individual's past can also modify their risk of AMD.

[36] Behavioral therapies which can be recommended for smoking cessation include but
are not limited to counseling, classes, printed information, electronic information,
video or audio tapes. Providing a behavioral therapy may involve merely
recommending it to a patient, prescribing it, or actually delivering the therapy.
Smokeless nicotine is also a possible means for weaning persons from a smoking
habit. Smokeless nicotine, like behavioral therapies, may or may not require a
physician's prescription. Smokeless forms of nicotine that can be used for smoking
cessation or abatement include but are not limited to nicotine gums, transdermal
patches, nasal sprays, and inhalers.

[37] Because the data indicate that the variant of CFH and the variant of LOC387715 are
independent predictive factors, they can both be assessed in the same person.
Together, these two types of variants are believed to account for the majority of cases
of AMD. Additional factors as discovered can also be tested, as they become
available to the art.

[38] Using iterative high-density SNP association mapping, we have identified a coding
change in the LOC387715 gene, at SNP rs10490924, as the most likely second major
AMD susceptibility allele. We also generated statistical evidence of
gene-environment interaction for this variant, suggesting that a genetic susceptibility
coupled with a modifiable lifestyle factor such as cigarette smoking confers a
significantly higher risk of AMD than either factor alone. Genotype frequencies at
rs10490924 were strongly correlated with pack-years of smoking in AMD patients,
consistent with heterogeneity analysis of the genetic linkage data. It is striking that we
have observed evidence for gene-environment interaction in two different datasets
using two statistically independent approaches. However, the presence of statistical
interaction does not prove biological interaction, and much work remains to be done
to identify the molecular mechanism underlying the increased AMD risk.

[39] Our data did not support the previously reported association of AMD with the
GRK5/RGS10 region at ~121 Mb (Jakobsdottir et al. 2005) since the four SNPs
(hCV1809962, rs871196, rs1537576, rs1467813) that we genotyped in this region did
not demonstrate significant association (p>0.05). The GIST and conditional haplotype
analyses suggested that only rs10490924, and surrounding SNPs in LOC387715 in
high LD with it, explained the linkage and association signals in this region. See other
SNPs in LOC387715 at SEQ ID NO: 7-56. Neither analysis supported SNPs in the
nearby PLEKHA1 and PRSS11 genes as being responsible for either the linkage or
association evidence. Consistent with these results, the most significant single-SNP
associations, the highest odds ratios, and the highest nonparametric two-point lod
score of 3.2 were contributed by SNPs in the LOC387715 gene. While we did not
re-sequence the nearby PLEKHA1 and PRSS11 genes, we genotyped the vast majority of
SNPs examined by the earlier studies in our dataset. Several SNPs in the CUZD1
gene, which is not in LD with the PLEKHA1/LOC387715 LD block, gave substantial
association signals with logistic regression (smallest p-value: 0.0002), but allele
frequency differences in cases and controls were much less pronounced for these
SNPs (MAFcases ~55%, MAFcontrols ~48%), compared to SNPs in LOC387715
(MAFcas^W, MAFcontrols ~26%). In addition, the GIST method and the conditional
haplotype analysis suggested that these SNPs did not explain the linkage and
association signals in this region.



[40] The limitations of any retrospective epidemiologic study apply to our findings,
including the potential for recall bias of past exposures. The validity of the summary
PAR% estimates depends on the extent to which our case-control dataset is
representative of a population-based sample of AMD patients and controls. Since our
dataset was used to identify the LOC387715 susceptibility variant, it is possible that
its effect size, and hence its PAR%, was overestimated (Lohmueller et al. 2003;
Ioannidis et al. 2001). Independent population-based studies of large sample size,
ideally collected in a prospective fashion, are needed to confirm the statistical
interaction between smoking and rs10490924 in contributing to AMD and its clinical
subtypes, and to refine estimates of their individual and joint PAR%.

[41] There is currently no biological explanation for the mechanism by which LOC387715
may increase the risk of AMD. It is not clear whether this statistical association
provides further support to the role of the innate immunity system that was
highlighted by the recent discovery of the CFH gene. LOC387715 is a two-exon gene
that encodes a protein of 107 amino acids, whose only homologue is a chimpanzee
gene of 97% protein identity. No significant matches were found with any known
protein motifs. ESTs have been recovered from the placenta and the testis, and this
gene has recently been reported to be weakly expressed in the retina (Rivera et al.
2005).

[42] In summary, we have replicated and refined previous reports implicating a coding
change in LOC387715 as the second major AMD susceptibility allele. The effect of
rs10490924 appears to be completely independent of the Y402H variant in the CFH
gene. The joint effect of these two susceptibility genes is consistent with a
multiplicative model, and together, they may explain as much as 65% of the PAR of
AMD. Previous data by our group suggested that the joint effects of CFH and
smoking are also consistent with a multiplicative model (Scott et al. 2005). In
contrast, the effect of rs10490924 appears to be strongly modified by cigarette
smoking. Smoking and LOC387715 together may explain as much as 34% of AMD.
While the marginal effect of rs10490924 was strong enough to be detected without
incorporating smoking history information, an effect modification of a genetic
susceptibility by a lifestyle factor like smoking has important implications for the
clinical interpretation of this finding. Our data suggest that the T allele at rs10490924
may only moderately increase the AMD risk in non-smokers and likely exerts its
strongest effect on heavy smokers. This has the potential to reduce the impact of an
AMD susceptibility allele on the aging population by public health efforts, such as
smoking prevention and smoking cessation programs. Our replication of the 10q26
linkage heterogeneity due to smoking, and the consistency of results from multiple
statistically independent approaches for assessing gene-environment interaction
reported here, are unusual in genetic studies of complex human diseases and provide
substantial support to our findings.
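
The distinction drawn above between a multiplicative joint effect and effect modification can be made concrete on the odds scale. In a logistic model without an interaction term, the odds ratio for carrying both factors is the product of the two single-factor odds ratios; a ratio of observed to expected joint odds ratio that departs from 1 indicates statistical interaction. The sketch below is only illustrative: the function names and all numerical odds ratios in it are hypothetical and are not taken from the tables of this disclosure.

```python
def expected_joint_or(or_factor_1: float, or_factor_2: float) -> float:
    """Joint odds ratio expected under a purely multiplicative model."""
    return or_factor_1 * or_factor_2


def interaction_ratio(or_factor_1: float, or_factor_2: float, or_joint: float) -> float:
    """Observed joint OR divided by the multiplicative expectation.

    A value near 1 is consistent with a multiplicative model; a value well
    above 1 suggests that one factor strengthens the effect of the other.
    """
    return or_joint / expected_joint_or(or_factor_1, or_factor_2)


if __name__ == "__main__":
    # Hypothetical odds ratios, chosen only to illustrate the arithmetic.
    print(expected_joint_or(2.0, 3.0))        # multiplicative expectation
    print(interaction_ratio(2.0, 3.0, 6.0))   # consistent with multiplicative model
    print(interaction_ratio(1.2, 1.1, 4.0))   # joint effect exceeds the expectation
```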

[43] We used iterative association mapping to identify a susceptibility gene for age-related
macular degeneration (AMD) on chromosome 10q26, which is one of the most
consistently implicated linkage regions for this disorder. We employed linkage
analysis methods, followed by family-based and case-control association analysis
using two independent datasets. To identify statistically the most likely AMD
susceptibility allele, we used the Genotype-IBD Sharing Test (GIST) and conditional
haplotype analysis. To incorporate the two most important known AMD risk factors,
smoking and the Y402H variant of the complement factor H (CFH) gene, we used
logistic regression modeling to test for gene-gene and gene-environment interaction in
the case-control dataset, and the ordered subset analysis (OSA) to account for genetic
linkage heterogeneity in the family-based dataset. Our results strongly implicate a
coding change (Ala69Ser) in the LOC387715 gene as the second major AMD
susceptibility allele, confirming earlier suggestions. Its effect on AMD is statistically
independent of CFH and of similar magnitude to Y402H. The overall effect is driven
primarily by a strong association in smokers, as we observed significant evidence for a
statistical interaction of the LOC387715 variant with a history of cigarette smoking.
This gene-environment interaction is supported by statistically independent
family-based and case-control analysis methods. We estimate that LOC387715 and smoking
together explain 34% of the population-attributable risk (PAR) of AMD. Further, we
estimate that LOC387715 and CFH together account for 65% of the PAR of AMD.
For the first time, we demonstrate that a genetic susceptibility coupled with a
modifiable lifestyle factor such as cigarette smoking confers a significantly higher risk
of AMD than either factor alone.

[44] The above disclosure generally describes the present invention. All references
disclosed herein are expressly incorporated by reference. A more complete
understanding can be obtained by reference to the following specific examples, which
are provided herein for purposes of illustration only, and are not intended to limit the
scope of the invention.

EXAMPLE 1 

[45] To identify the responsible gene on chromosome 1q32, we initially genotyped 44
SNPs (12) across the 24 megabases (Mb) incorporating this linkage region. We
examined two independent data sets: the first contained 182 families (111 multiplex
and 71 discordant sibpairs) and the second contained 495 AMD cases and 185
controls. Each SNP was tested for association independently in both data sets. Two
SNPs (rs2019724 and rs6428379) in moderate linkage disequilibrium with each other
(r²=0.61) generated highly significant associations with AMD in both the
family-based data set (rs2019724, P=0.0001; rs6428379, P=0.0007) and in the case-control
data set (rs2019724, P<0.0001; rs6428379, P<0.0001). These SNPs lie approximately
263 kilobases (Kb) apart.
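
For readers unfamiliar with the linkage disequilibrium measures used in this example and in the Haploview figures, the sketch below computes D' and r² for two biallelic SNPs from a haplotype frequency and the two allele frequencies, using the standard textbook definitions. The function name and the input frequencies are hypothetical; the study's own values were derived from the genotype data and are not reproduced here.

```python
from typing import Tuple


def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both SNPs must be polymorphic (0 < frequency < 1)")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical frequencies: allele B occurs only on haplotypes carrying allele A,
    # so D' is at its maximum while r^2 stays below 1.
    print(linkage_disequilibrium(p_ab=0.3, p_a=0.4, p_b=0.3))
```

The hypothetical example shows why the two measures are reported separately: D' reaches 1 whenever one of the four possible haplotypes is absent, whereas r² reaches 1 only when the two SNPs are perfect proxies for each other.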

EXAMPLE 2 

[46] To define the extent of linkage disequilibrium completely, an additional 17 SNPs
were genotyped across approximately 655 Kb flanked by rs1538687 and rs1537319
and encompassing the 263 Kb region. Two linkage disequilibrium blocks of 11 Kb
and 74 Kb were identified and were separated by 176 Kb (Fig. 1). The 11 Kb block
contained rs2019724 and the 74 Kb block contained rs6428379. Association analysis
of the 17 SNPs identified multiple additional SNPs giving highly significant
associations in one or both of the family-based and case-control data sets (Fig. 2). In
the case-control data set, a five SNP haplotype (GAGGT, defined by SNPs rs1831281,
rs3753395, rs1853883, rs10494745, and rs6428279, respectively) comprised 46% of
the case and 33% of the control chromosomes (P=0.0003). This same haplotype was
also significantly over-transmitted in the family-based data set
(P=0.00003). The convergence of the most significant associations to this same
haplotype in the two independent data sets strongly suggests that this region contains a
commonly inherited variant in an AMD risk gene.

[47] The associated GAGGT haplotype spans approximately 261 Kb. It contains the
Complement Factor H gene (CFH, OMIM #: 134370, Accession #: NM_000186) and
the five Factor H-related genes CFHL1-5, and lies within the Regulator of
Complement Activation (RCA) gene cluster. The most consistent association results
(Fig. 2) from both the family-based and case-control data sets converge within the
CFH gene, implicating CFH as the AMD susceptibility gene. The biological role of
Complement Factor H as a component of the innate immune system that modulates
inflammation through regulation of complement (reviewed in (13)) enhances its
attractiveness as a candidate AMD susceptibility gene. Inflammation has been
repeatedly implicated in AMD pathology. C-reactive protein levels are elevated in
advanced disease (14), anti-retinal autoantibodies have been detected in AMD patients
(15), macrophages are localized near neovascular lesions (16), and the hallmark
drusen deposits contain many complement-related proteins (17).

EXAMPLE 3 

[48] We screened for potential risk-associated sequence variants in the coding region of
CFH by sequencing 24 cases with severe neovascular disease and 24 controls with no
evidence of AMD. To maximize the likelihood of identifying the risk-associated
allele, all sequenced cases and controls were homozygous for the GAGGT haplotype.
Five novel and six known sequence variants were detected (Table 1). Only one
variant (rs1061170, sequence: T1277C, protein: Y402H) was present significantly
more often in cases than controls, occurring on 45/48 haplotypes in the cases and on
22/48 haplotypes in the controls (P<0.0001). The frequency of sequence variants
within the CFH coding region on the associated haplotype was significantly reduced
in cases compared to controls (12% vs. 18%, P=0.002). When the over-represented
T1277C variant was removed from the analysis, this difference became more
pronounced (3% vs. 16%, P<0.00001). Thus T1277C is the primary DNA sequence
variant differentiating between the case and control haplotypes.

WO 2006/096561    PCT/US2006/007725

Table 1. CFH sequence variants identified in neovascular AMD cases and normal controls.
All individuals were homozygous for the AMD-associated GAGGT haplotype. The 24
affected individuals selected for sequencing had severe neovascular disease (grade 5) (12)
with diagnosis before age 74 (mean age at diagnosis: 65.8 yrs). The 24 control individuals
selected for sequencing had no evidence of AMD (grade 1) with age at exam
(mean age at exam: 69.8 yrs). The six previously identified SNPs are labeled using standard
nomenclature. The five novel variants are labeled given their base pair location on
chromosome 1, Ensembl build 35. Five SNPs create non-synonymous amino acid changes
within CFH and five SNPs create synonymous changes. Exon 1 is not translated.

EXAMPLE 4

[49] We screened for potential risk-associated sequence variants in the coding region of
CFH by sequencing 24 cases with severe neovascular disease and 24 controls with no
evidence of AMD. To maximize the likelihood of identifying the risk-associated
allele, all sequenced cases and controls were homozygous for the GAGGT haplotype.
Five novel and six known sequence variants were detected (Table 1). Only one
variant (rs1061170, sequence: T1277C, protein: Y402H) was present significantly
more often in cases than controls, occurring on 45/48 haplotypes in the cases and on
22/48 haplotypes in the controls (P<0.0001). The frequency of sequence variants
within the CFH coding region on the associated haplotype was significantly reduced
in cases compared to controls (12% vs. 18%, P=0.002). When the over-represented
T1277C variant was removed from the analysis, this difference became more
pronounced (3% vs. 16%, P<0.00001). Thus T1277C is the primary DNA sequence
variant differentiating between the case and control haplotypes.

EXAMPLE 5 

[50] Complete genotyping of T1277C in the family-based and case-control data sets 
revealed a significant over-transmission in the families (P=0.019) (12) and a highly 
significant over-representation in the cases compared to controls (P=0.00006). The 
odds ratio for AMD was 2.45 (95% CI: 1.41-4.25) for carriers of one C allele and 3.33 
(95% CI: 1.79-6.20) for carriers of two C alleles. When the analysis was restricted to 
only neovascular AMD, these odds ratios increased to 3.45 (95% CI: 1.72-6.92) and 
5.57 (95% CI: 2.52-12.27), respectively. This apparent dose effect for risk associated 
with the C allele was highly significant (P<0.0001). There was no apparent allelic or 
genotypic effect of T1277C on age at AMD diagnosis (mean age at diagnosis: TT: 
76.5 yrs; TC: 77.5 yrs; CC: 75.5 yrs). The population attributable risk percent for 
carrying at least one C allele was 43% (95% confidence interval 23-68%). 

[51] The Y402H variant is predicted to have functional consequences consistent with 
AMD pathology. Residue 402 is located within binding sites for heparin (18) and 
C-reactive protein (CRP) (19). Binding to either of these partners increases the affinity 
of CFH for the complement protein C3b (20, 21), augmenting its ability to 
down-regulate complement's effect. The observed co-localization of CFH, CRP, and 
proteoglycans in the superficial layer of the arterial intima suggests that CFH may 
protect the host arterial wall from excess complement activation (21). We 
hypothesize that allele-specific changes in the activities of the binding sites for 
heparin and CRP would alter CFH's ability to suppress complement-related damage to 
arterial walls, and might ultimately lead to vessel injury and subsequent 
neovascular/exudative changes such as those seen in neovascular AMD. Our data 
support this hypothesis since the risk associated with the C allele is more pronounced 
when the analyses are restricted to neovascular AMD. Given the known functional 
interactions of genes within the RCA gene cluster (13), variants within these genes 
could interact with or modify the effect of the T1277C variant. 

[52] Interestingly, plasma levels of CFH are known to decrease both with age and with 
smoking (23), two known risk factors for AMD (2). This confluence of genetic and 
environmental risk factors suggests an integrated etiological model of AMD involving 
chronic inflammation. Identification of the increased risk of AMD associated with the 
T1277C variant should enhance our ability to develop presymptomatic tests for AMD, 
possibly allowing earlier detection and better treatment of this debilitating disorder. 

EXAMPLE 6 (relates to examples 1-5) 
Participants 

[53] We ascertained AMD patients and their affected and unaffected family members 
through two clinics in the Southeastern United States: Duke University Medical 
Center (DUMC) and Vanderbilt University Medical Center (VUMC). Unrelated 
controls of similar age and ethnic background were enrolled via (i) study 
advertisement in DUMC- and VUMC-affiliated newsletters; (ii) recruitment 
presentations by study coordinators at local retirement communities, whose residents 
were likely to obtain health care at DUMC or VUMC, respectively; (iii) AMD-related 
seminars for the general public sponsored by DUMC or VUMC ophthalmology clinics; 
and (iv) referrals from other clinics in the Duke and Vanderbilt Eye Centers of individuals 
without evidence of ocular disease. Spouses of AMD patients were also asked to 
participate as potential controls. Controls eligible for enrollment were offered a free 
comprehensive eye exam including fundus photography to ensure that the same 
methodology was used to assign AMD grades as for the AMD patients and their 
relatives ascertained in clinic. All cases and controls included in this study were 
Caucasian and at least 55 years of age. The study protocol was approved by the 
respective Institutional Review Boards (IRB) at DUMC and VUMC, and the research 
adhered to the tenets of the Declaration of Helsinki. 

[54] The family-based data set consisted of 111 multiplex families with at least two 
individuals with grade 3 or higher AMD in at least one eye. Seventy-three families 
had two affected individuals, 29 families had three affected individuals, and nine 
families had four or more affected individuals. Unaffected spouses and siblings were 
collected whenever possible. An additional 71 families consisted of one affected 
individual and at least one unaffected sibling (discordant sibpairs). 

Clinical Assessment 

[55] The assignment of AMD affection status was based on the clinical evaluation of 
stereoscopic color fundus photographs of the macula (EAP, AA), according to a 
5-grade system described previously (S1). Grade 1 has no AMD features, grade 2 has 
only small non-extensive drusen, grade 3 has extensive intermediate and/or large 
drusen, grade 4 is geographic atrophy, and grade 5 is neovascular AMD. This system 
is a slight modification of the Age-Related Eye Disease Study (AREDS) grading 
system and uses example slides from the Wisconsin Grading System (S2) and the 
International Classification System (S3) as guides. Affection status was defined by 
the most severe grade in either eye. All questionnaire data and samples were collected 
after informed consent was obtained. 

Molecular Analyses 

[56] Genomic DNA was extracted from whole blood by the Duke CHG or Vanderbilt 
CHGR DNA banking cores using the PureGene system (Gentra Systems, 
Minneapolis, MN) on an Autopure LS. Genotyping was performed using TaqMan on 
the ABI Prism 7900HT, and analyzed with the SDS software. SNP Assays-On-Demand 
or Assays-By-Design were obtained from Applied Biosystems Incorporated 
(Foster City, CA). The initial set of 44 SNPs was chosen to approximate a 500 kb 
spacing between markers. 

[57] Exons of CFH were PCR amplified from genomic DNA, sequenced using BigDye 
v3.1 (ABI) on an ABI 3730 automated sequencer, and analyzed using Mutation 
Surveyor software (SoftGenetics, State College, PA). T1277C falls within a genomic 
duplication and could not be genotyped using TaqMan assays. All individuals were 
sequenced using primers GGTTTCTTCTTGAAAATCACAGG (SEQ ID NO: 5) and 
CCATTGGTAAAACAAGGTGACA (SEQ ID NO: 6) to determine T1277C 
genotypes. 

Statistical Analyses 

[58] Linkage disequilibrium and Hardy-Weinberg equilibrium calculations were done 
using Haploview version 3.0 using all case and control samples and one random 
individual from each of the families (S4). Haplotype blocks were defined using the D' 
parameter and the default definitions within Haploview. Allele frequency differences 
were tested using a χ² test. 
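
The χ² comparison of allele frequencies named above can be sketched in a few lines of Python. This is only an illustration: it applies a Pearson χ² test with one degree of freedom and no continuity correction to the T1277C haplotype counts reported earlier for the sequenced samples (45 of 48 case haplotypes versus 22 of 48 control haplotypes). The function name chi_square_2x2 and the choice of an uncorrected test are assumptions; the text does not state which variant of the test produced the reported P values.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: cases, controls. Columns: haplotypes with / without the C allele.
chi2, p_value = chi_square_2x2(45, 3, 22, 26)
print(f"chi2 = {chi2:.2f}, P = {p_value:.1e}")
```

Under these assumptions the statistic is well beyond the P<0.0001 threshold quoted for this comparison.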

[59] Single-locus and haplotype family-based association was tested using the Association 
in the Presence of Linkage (APL) method (S5) that performs a correct TDT-style test 
of association in the presence of linkage, using nuclear families with at least one 
affected individual and any number of unaffected siblings or parents. Odds ratios 
were calculated using standard logistic regression models (SAS version 9.1, SAS 
Institute, Cary, NC). The outcome variable was AMD affection status and genotypes 
were coded according to a log-additive model. Dose-response was tested using the χ² 
test for trend. Haplotype analysis in the case-control data set was tested using the 
"haplo.stats" program that uses a likelihood-based method to estimate haplotype 
frequencies (S6). 

[60] The 95% confidence interval for the population attributable risk percent (PAR%) for 
T1277C was calculated on the point estimate of the PAR% (43%), which was 
calculated from the combined frequency of genotypes CT and CC in controls and the 
unadjusted odds ratio (OR) of AMD for these genotypes relative to the TT reference 
group (S7). Calculation of the PAR% from case-control data assumes that the controls 
are representative of the general population and the disease is rare (<5% population 
prevalence across all exposure levels). PAR% calculated from OR adjusted for age 
and sex was similar. 
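
As a rough illustration of how a PAR% can be obtained from a carrier frequency in controls and an odds ratio, the sketch below uses Levin's formula, PAR% = 100 * p(OR - 1) / (1 + p(OR - 1)). Levin's formula is a common choice for this calculation, but the text does not spell out the exact formula that was used, so treating it as the method here is an assumption. The carrier frequency and odds ratio in the example are hypothetical placeholders, not values from the study.

```python
def par_percent(exposure_freq, odds_ratio):
    """Population attributable risk percent by Levin's formula.

    exposure_freq: frequency of the risk genotypes among controls (0..1).
    odds_ratio:    odds ratio for the risk genotypes versus the reference group,
                   used as an approximation to the relative risk (rare disease).
    """
    excess = exposure_freq * (odds_ratio - 1.0)
    return 100.0 * excess / (1.0 + excess)


# Hypothetical inputs, chosen only to show the arithmetic.
carrier_freq_controls = 0.60
odds_ratio_carriers = 2.5
print(f"PAR% = {par_percent(carrier_freq_controls, odds_ratio_carriers):.1f}")
```

The rare-disease assumption stated above is what justifies substituting the odds ratio for the relative risk in this formula.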

[61] We note that the P-value of the T1277C association in the family-based data set is not 
as significant as the P-value for the two original SNPs. This results from the 
ascertainment bias toward severe disease in the family collection, which results in an 
oversampling of T1277C-CC homozygotes. Family-based tests of association depend 
on both transmission and association. Oversampling for homozygosity reduces the 
power of any family-based transmission disequilibrium test. Since the original SNPs 
have low linkage disequilibrium values with T1277C (r²=0.00 and 0.14 for rs2019724 
and rs6428379, respectively), they were not over-sampled for homozygosity to the 
extent of T1277C. In the case-control data set, where the sampling bias is not as 
profound, the P-values for all three SNPs are similarly highly significant. 
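
For readers unfamiliar with the linkage disequilibrium measures used in this section (D' for block definition, r² for pairwise correlation), the following Python sketch computes both from the frequency of a two-SNP haplotype and the two allele frequencies. It uses the standard textbook definitions; the function name and the example frequencies are hypothetical and are not taken from the study data.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2.
    p_a:  frequency of allele A at SNP 1.
    p_b:  frequency of allele B at SNP 2.
    Assumes both SNPs are polymorphic (0 < p_a < 1 and 0 < p_b < 1).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d_prime, r2


# Hypothetical frequencies: allele B occurs only on haplotypes that carry allele A.
d_prime, r2 = ld_measures(p_ab=0.30, p_a=0.40, p_b=0.30)
print(f"D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

In this hypothetical case D' is at its maximum while r² is well below 1, which shows why two SNPs can sit in the same D'-defined block and still be imperfect proxies for each other.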

Haplotype Analysis 

[62] The five-SNP haplotype block, defined by SNPs rs1831281, rs3753395, rs1853883, 
rs10494745, and rs6428279, identified five common haplotypes that capture over 95% 
of the haplotype variation (Table 2). The GAGGT haplotype is the most common in 
both the cases and controls, but is significantly more frequent in the cases. 

Table 2. The haplotypes and their frequencies calculated from the case-control data. The 
haplotype consists of SNPs rs1831281, rs3753395, rs1853883, rs10494745, and rs6428279, 
respectively. 

Table 3. Location of SCR domains in protein. 

EXAMPLE 7 

Linkage and Association Analysis 

[63] Resequencing of the LOC387715 and CUZD1 genes identified 21 known and 23 
novel SNPs (Supplemental Tables 1 and 2). Sequencing primers and conditions are 
available from the authors (MAH) upon request. Of these 44 SNPs, 19 were 
genotyped in our entire dataset. Genotypes for all SNPs analyzed here were in 
Hardy-Weinberg equilibrium in unrelated controls (p>0.01). We observed high LD (D'>0.9) 
across a 60 kb region including a frequent coding SNP in exon 12 of PLEKHA1 
(rs1045216), three coding SNPs in LOC387715 (rs10490923, rs2736911, 
rs10490924) and several additional non-coding PLEKHA1 and LOC387715 SNPs, 
replicating earlier observations (Rivera et al. 2005). Notably, the adjacent downstream 
gene PRSS11 (HtrA serine peptidase 1 (HTRA1) [MIM 602194]) was not included in 
this 60 kb region (figure 2). 
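
The Hardy-Weinberg check mentioned in the paragraph above can be illustrated with a simple goodness-of-fit test. The Python sketch below compares observed genotype counts with the counts expected from the estimated allele frequency and converts the χ² statistic (1 degree of freedom) to a P value. The text does not say which Hardy-Weinberg test was applied, so the χ² form is an assumption, and the genotype counts are hypothetical; only the p>0.01 acceptance threshold comes from the text.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    Assumes both alleles are observed, so that no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)  # frequency of allele a
    q = 1.0 - p
    expected = (n * p * p, 2.0 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts for one SNP in a control sample.
chi2, p_value = hwe_chi_square(143, 98, 18)
in_equilibrium = p_value > 0.01  # threshold quoted in the text
print(f"chi2 = {chi2:.3f}, P = {p_value:.2f}, passes HWE filter: {in_equilibrium}")
```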

[64] In the family-based linkage analysis, a peak multipoint lod score was obtained at 
124.7 Mb (HLOD 3.0 under affecteds-only dominant model, nonparametric LOD* 
2.6, figure 1). SNP rs10664316 in LOC387715 (124.2 Mb) gave a maximum 
nonparametric two-point lod score of 3.2. In the case-control analysis, four highly 
correlated SNPs in the LOC387715 gene, including the frequent coding change 
rs10490924 in exon 1 previously implicated (Rivera et al. 2005), were very strongly 
associated with AMD, with logistic regression p-values on the order of 10^-8 (table 5). 
The minor allele frequency (MAF) of these highly correlated SNPs was ~41.7% in 
cases, very similar to that reported by Rivera et al., and ~25.8% in controls, somewhat 
higher than the 19.6% reported by Rivera et al. Within the 60 kb LD block, and in the 
entire 122-127 Mb region, association signals of this order of magnitude were 
observed only for this set of highly correlated SNPs. In particular, the coding SNP in 
exon 12 of PLEKHA1 (rs1045216) showed substantially weaker evidence for 
association, both in terms of magnitude (odds ratio, OR) and statistical significance 
(MAF in cases: 28.2%, MAF in controls: 36.8%, OR=0.6, p=0.02). Unlike the previous reports, 
we detected a second region of association 400 kb distal to LOC387715 that included 
several SNPs in the CUZD1 gene and an even more distal SNP in the FAM24A gene 
(family with sequence similarity 24, member A [HGNC: 23470]). These SNPs, which 
were in LD with each other but not in LD with the associated SNPs in LOC387715 
(figure 2), showed independent evidence for association with AMD risk, although at 
much lower statistical significance (MAF in cases: ~55%, MAF in controls: ~48%, 
p=0.0002-0.0058). 



EXAMPLE 8 

GIST Analysis 






[65] All SNPs with p-values <0.005 in the case-control analysis were analyzed with GIST 
to test if they explained the linkage signal in the region. Under the additive weighting 
scheme suggested by the case-control analysis (Li et al. 2004), only the four SNPs in 
the LOC387715 gene were significant in the GIST analysis (table 5). This suggests 
that the LOC387715 gene alone is responsible for the 10q26 linkage evidence. 

EXAMPLE 9 

Conditional Haplotype Analysis 

[66] With the combined case-control dataset, we used conditional haplotype modeling to 
identify the statistically most likely AMD susceptibility variant from among all the 
SNPs with strong evidence for association. We tested each SNP in table 5, 
conditioning on the risk allele of the most strongly associated SNP in CUZD1, 
FAM24A and LOC387715. Conditioning on the risk allele at rs1891110 in CUZD1, 
rs10490924 was strongly associated (p=7.6E-05) while none of the other SNPs were 
significant (p>0.05). Conditioning on the risk allele at rs2293435 in FAM24A, 
rs10490924 was strongly associated (p=7.1E-05) while none of the other SNPs were 
significant (p>0.05). Only conditioning on the risk allele at rs10490924 fully 
explained the association signal in the region, such that none of the other SNPs 
showed any evidence for association (p>0.6). Thus, this analysis also strongly 
implicates the LOC387715 gene alone in AMD, consistent with the Rivera et al. 
study. 

EXAMPLE 10 

Gene-Gene Interaction analysis 

[67] We estimated joint odds ratios for all genotype combinations of the Y402H variant in 
CFH and the rs10490924 variant in LOC387715 (table 6). The TT/GG combination 
was used as the referent group. For individuals with the TT genotype at Y402H, the 
GT genotype at rs10490924 conferred a 2.7-fold increase in AMD risk (p=0.02) and 
the TT genotype conferred a 13.1-fold increase (p=0.003). For individuals with the 
CC genotype at Y402H, which conferred a 4-fold increase in AMD risk for GG 
genotypes at rs10490924 (p=0.0007), the GT genotype conferred a 12.6-fold increase 
in AMD risk (p<0.0001) and the TT genotype conferred a 23.8-fold increase 
(p<0.0001). Consistent with results of the AIC modeling strategy (table 7), the joint 
action of the Y402H and the rs10490924 variants was therefore best described by 
independent multiplicative effects, without statistically significant evidence for 
dominance effects or epistatic interaction. The joint effect of Y402H and rs10490924 
accounted for 65.1% of the population attributable risk (PAR) of AMD (Bruzzi et al. 
1985). 

EXAMPLE 11 

Case-Control Gene-Environment Interaction Analysis 

[68] In contrast, we found strong evidence for statistical interaction of smoking and 
genotypes at rs10490924. The model with the ADD_SMOKE_INT term provided a 
significantly better fit to the data by 5.2 AIC units, compared to the model without this 
term (table 7). A significant product term with positive regression coefficient for 
smoking and rs10490924 in the logistic regression model indicated more than 
multiplicative joint effects (p=0.007). In our dataset, the presence of the LOC387715 
susceptibility allele did not confer a significantly increased risk of AMD to 
non-smokers (p=0.59 for GT genotype, p=0.12 for TT genotype, table 8), while the GT 
genotype in smokers increased the risk 2.7-fold (p=0.001) and the TT genotype in 
smokers increased the risk 8.2-fold (p<0.0001). A case-only analysis of rs10490924 
and pack-years of smoking (as a continuous variable) also supported the presence of 
gene-environment interaction (p=0.05 adjusted for age and sex). The relative 
frequency of TT genotypes in affected individuals increased almost linearly with 
increasing pack-years of smoking, with a corresponding decrease of GG genotype 
frequencies (figure 3, panel A). This pattern was strikingly similar to results for 
simulated data when the disease status was generated with a logistic regression model 
including a gene-environment interaction term (Schmidt et al. 2005). Genotype 
frequencies at rs10490924 were not related to pack-years of smoking in our control 
sample (Fig. 3B), confirming that the result in cases was due to gene-environment 
interaction rather than population correlation of the two factors. The joint effect of 
rs10490924 and smoking accounted for 34.3% of the PAR of AMD. 

EXAMPLE 12 

Family-Based Gene-Environment Interaction Analysis 

[69] The highly significant association of AMD with rs10490924 that was observed in the 
initial case-control analysis was not replicated in the family-based analysis with APL. 
This could be due to the smaller size of our family-based dataset, or to between-family 
heterogeneity. To test the latter possibility, we applied OSA to our multiplex family 
dataset, using the average pack-years of smoking in affected individuals as the OSA 
covariate (ordered from high to low). OSA indicated that the majority of linkage 
evidence in the 10q26 region was contributed by only 40 families with an average of > 
44 pack-years of smoking (figure 4). The difference in nonparametric lod scores 
between the 90 multiplex families with sufficient information to calculate average 
smoking pack-years and the 40 families with heavy smokers was significant 
(p=0.048), based on 10,000 runs of the OSA permutation test (Hauser et al. 2004). 
When the APL analysis was repeated using only multiplex and singleton families 
which met the "heavy smoking" criterion in affected individuals (family-average of > 
44 pack-years of smoking, 46 families total), the results confirmed the case-control 
association analysis: the APL p-value for rs10490924 and rs3750848 in LOC387715 
was 0.02. Three SNPs in other genes also had p-values of 0.02: rs760336 in PRSS11 
adjacent to LOC387715, rs1052715 in DMBT1 (deleted in malignant brain tumors 1 
[MIM 601969]) and hCV2917031 in GPR26 (G protein-coupled receptor 26 [MIM 
604847]). None of these three SNPs had a case-control association p-value <0.05 in the 
overall analysis. 

EXAMPLE 13 

Clinical Subgroup Analysis 

[70] It is of great clinical interest to determine whether the modification of the LOC387715 
association by cigarette smoking is observed in both geographic atrophy (GA, grade 4) 
and neovascular AMD (CNV, grade 5). Table 9 shows that the association with 
LOC387715 in smokers was primarily due to genotype frequency differences between 
grade 1 controls (8.3% with genotype TT) and CNV patients (29.3% with genotype 
TT). When all genotyped individuals regardless of smoking information were 
evaluated, the frequency of the T allele was higher in patients with CNV (47.6%) 
compared to GA (39.0%). Our dataset had limited statistical power for the AMD 
subtype comparison since it included a much smaller number of GA patients, 
compared to CNV patients (table 4), and since smoking history information was not 
available for all study participants. 

EXAMPLE 14 (relates to examples 7-13) 
Study population 

[71] As part of an ongoing large-scale study of genetic and environmental risk factors for 
AMD, we have ascertained AMD patients, their affected and unaffected family 
members, and a group of unrelated controls of similar age and ethnic background at 
two sites in the Southeastern United States: Duke University Eye Center (DUEC) and 
Vanderbilt University Medical Center (VUMC). Using stereoscopic color fundus 
photographs, all enrolled individuals were assigned (by EAP and AA) one of five 
different grades of macular findings, as described previously (Schmidt et al. 2000; 
Seddon et al. 1997) and summarized in Table 4. Our AMD classification is a 
modification of the AREDS grading system, using Wisconsin grading system example 
slides (Klein et al. 1991) and the International Classification System (Bird et al. 1995) 
as guides. The more severely affected eye was used to classify individuals. Unrelated 
controls were enrolled via (i) study advertisement in DUEC- and VUMC-affiliated 
newsletters; (ii) recruitment presentations by study coordinators at local retirement 
communities, whose residents were likely to obtain health care at DUEC or VUMC, 
respectively; and (iii) AMD-related seminars for the general public sponsored by 
DUEC or VUMC ophthalmology clinics. Spouses of AMD patients were also asked to 
participate as controls. All cases and controls included in this study were white and at 
least 55 years of age. The study protocol was approved by the Institutional Review 
Boards (IRB) of the Duke University Medical Center and VUMC, the research 
adhered to the tenets of the Declaration of Helsinki, and informed consent was 
obtained from all study participants. Blood samples were collected and genomic DNA 
was extracted from whole blood using the PureGene system (Gentra Systems, 
Minneapolis, MN) on an Autopure LS. 

[72] Information about the smoking history of study participants was obtained from a 
self-administered questionnaire that was formatted to maximize readability for individuals 
with low vision. However, if participants indicated that they could not complete the 
form, a project coordinator offered to assist the participants in filling out the 
questionnaire. Regular cigarette smoking was assessed by two questions: 1) "Have 
you smoked at least 100 cigarettes in your lifetime?" and 2) "Did you ever smoke 
cigarettes at least once per week?" Individuals answering "yes" to both questions 
were asked the average number of cigarettes they smoked per day, the year that they 
started smoking, whether they had quit smoking, and if so, what year. This 
information was used to calculate pack-years of smoking as (cigarettes per day * years 
smoked) / 20 cigarettes per pack. The most general measurement of smoking history 
was constructed as an "ever/never" variable based on a participant's response to 
question 1) above. 
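
The pack-years formula given above is simple enough to express directly. The Python sketch below is one possible reading of it: years smoked is taken as the difference between the year the participant quit (or, for current smokers, a reference year such as the year of the questionnaire) and the year they started. That derivation of "years smoked", the function name, and the parameter names are assumptions for illustration; the text only specifies the final formula.

```python
CIGARETTES_PER_PACK = 20


def pack_years(cigarettes_per_day, year_started, year_quit=None, reference_year=None):
    """Pack-years = (cigarettes per day * years smoked) / 20 cigarettes per pack."""
    end_year = year_quit if year_quit is not None else reference_year
    if end_year is None:
        raise ValueError("reference_year is required for current smokers")
    years_smoked = end_year - year_started
    return cigarettes_per_day * years_smoked / CIGARETTES_PER_PACK


# A former smoker: one pack a day from 1960 until quitting in 1990.
print(pack_years(20, 1960, year_quit=1990))
# A current smoker: half a pack a day since 1980, questionnaire completed in 2000.
print(pack_years(10, 1980, reference_year=2000))
```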

[73] The study population for the analysis presented here included 810 unrelated AMD 
patients with early (grade 3) or advanced (grades 4 and 5) AMD. Of these, 200 had at 
least one sampled (affected or unaffected) relative and thus contributed to the 
family-based association analysis. The remaining 610 AMD patients without sampled 
relatives, and 259 unrelated controls without AMD (grades 1 and 2), made up an 
independent case-control dataset. Demographic and clinical information for these 
individuals is shown in table 4. 

Genotyping, Linkage and Association Analysis 

[74] Previous work by our group (Kenealy et al. 2004) and others (Weeks et al. 2004; 
Majewski et al. 2003; Seddon et al. 2003; Iyengar et al. 2004) suggested the presence 
of an AMD susceptibility locus on chromosome 10q26, with the linkage peak centered 
at approximately 122 Mb. To narrow down the region most likely to harbor an AMD 
susceptibility allele, we genotyped 103 SNPs in the 112 to 132 Mb interval, extending 
10 Mb to either side of the reported linkage peak. We started with a density of 
approximately 1 SNP per 1 Mb and filled in the 117-127 Mb region immediately 
surrounding the 122 Mb peak with a higher density of one SNP per 140 kb on 
average. All SNPs were selected using SNPSelector software (Xu et al. 2005) to have 
approximately equal spacing with minor allele frequency > 5%. Genotyping was 
performed with the TaqMan allelic discrimination assay, using either Assays-On-Demand 
or Assays-By-Design products from Applied Biosystems. For quality control 
(QC) procedures, two CEPH standards were included on each 96-well plate, and 
samples from six individuals were duplicated across all plates, with the laboratory 
technicians blinded to their identities. Analysis required matching QC genotypes 
within and across plates and at least 95% genotyping efficiency. The Y402H variant 
of the CFH gene was genotyped by sequencing, as previously described (Haines et al. 
2005). 

[75] Following the first round of genotyping and statistical analysis, we applied iterative 
association mapping (Oliveira et al. 2005) to select another set of SNPs in the peak 
region, defined approximately as the 1-lod-score-unit support interval surrounding the 
peak multipoint lod score. In addition to using SNPSelector (Xu et al. 2005), SNPs 
were identified through resequencing of the LOC387715 gene and the CUZD1 gene 
(CUB and zona pellucida-like domains 1 [HGNC: 17937]) in 48-72 unrelated affected 
and unaffected individuals. Our final SNP density was an average of one SNP per 43 
kb, for a total of 117 SNPs in the 122-127 Mb region, and an average of one SNP 
every 220 kb outside of this interval, for a total of 185 SNPs in the 112-132 Mb 
region. 
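
The marker densities quoted above follow directly from the SNP counts and interval sizes. The short Python check below reproduces the arithmetic using only numbers stated in the paragraph; the variable names are illustrative.

```python
# Figures quoted in the paragraph above.
peak_start_mb, peak_end_mb = 122, 127
full_start_mb, full_end_mb = 112, 132
snps_in_peak = 117
snps_total = 185

peak_span_kb = (peak_end_mb - peak_start_mb) * 1000
flank_span_kb = (full_end_mb - full_start_mb) * 1000 - peak_span_kb
snps_in_flanks = snps_total - snps_in_peak

peak_spacing_kb = peak_span_kb / snps_in_peak
flank_spacing_kb = flank_span_kb / snps_in_flanks

print(f"peak region: one SNP per {peak_spacing_kb:.1f} kb")
print(f"outside peak: one SNP per {flank_spacing_kb:.1f} kb ({snps_in_flanks} SNPs)")
```

Both results agree with the stated averages of roughly 43 kb and 220 kb after rounding.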

[76] The genotype data were analyzed with MERLIN (Abecasis et al. 2002) to calculate 
nonparametric two-point and multipoint LOD* scores (Kong and Cox 1997), using 
the exponential model. Allele frequencies were estimated from all genotyped 
individuals. Parametric affecteds-only heterogeneity lod scores (HLODs) assuming a 
dominant (disease allele frequency 0.01) or recessive (disease allele frequency 0.2) 
model were also computed with MERLIN. To avoid an inflation of linkage evidence 
due to inter-marker linkage disequilibrium (LD) (Boyles et al. 2005), we used recently 
described methods based on estimated haplotype frequencies of SNP clusters in high 
pairwise LD, using a threshold of r²=0.16 to define these clusters (Abecasis and 
Wigginton 2005). The LD pattern in the region of interest was analyzed with the 
Haploview program (Barrett et al. 2005), using the generated genotypes from 
unrelated AMD patients as the input. Association analysis was applied to all SNPs in 
the 122-127 Mb region, using the family-based Association in the Presence of Linkage 
(APL) test (Martin et al. 2003) and standard logistic regression analysis for 
case-control comparisons with adjustment for age and sex (SAS version 8.02, SAS Institute 
Inc., Cary, NC). An additive coding scheme was used, with the SNP model covariate 
taking on values -1, 0 and 1 for genotypes 1/1, 1/2, and 2/2, and 2 being the minor 
allele in controls. As described above, we divided our total sample into cases 
contributing to the APL analysis (affected individuals with at least one sampled 
relative, n=200 families), and an independent sample of cases without sampled 
relatives (n=610) who were compared to 259 unrelated controls. We used the 
Genotype-IBD Sharing Test (GIST) method (Li et al. 2004) to examine which of the 
most strongly associated SNPs best explained the linkage evidence in the region. We 
also used the COCAPHASE module of the UNPHASED software package 
(Dudbridge 2003) to perform conditional haplotype analysis. This analysis tested 
whether conditioning on the risk allele at a particular SNP accounted for the 
association signal in the region. If the association signal in the region was driven by a 
single SNP, conditioning on its effect was expected to remove all evidence of 
association for the remaining SNPs. 
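
The additive coding scheme described in the paragraph above (covariate values -1, 0 and 1 for genotypes 1/1, 1/2 and 2/2, with allele 2 defined as the minor allele in controls) might be implemented along the following lines. This is a minimal Python sketch: genotypes are represented as two-letter strings, and the function names and the small control sample are hypothetical.

```python
from collections import Counter


def minor_allele(control_genotypes):
    """Return the less frequent allele among controls (allele "2" in the text)."""
    counts = Counter(allele for genotype in control_genotypes for allele in genotype)
    # Assumes a biallelic, polymorphic SNP; on a tie the first allele seen is returned.
    return min(counts, key=counts.get)


def additive_code(genotype, minor):
    """Map a genotype to -1, 0 or 1 according to the number of minor alleles."""
    return sum(1 for allele in genotype if allele == minor) - 1


# Hypothetical control genotypes for a G/T SNP.
hypothetical_controls = ["GG", "GT", "GG", "TT", "GT", "GG"]
minor = minor_allele(hypothetical_controls)
codes = {g: additive_code(g, minor) for g in ("GG", "GT", "TT")}
print(minor, codes)
```

With this coding, the logistic regression coefficient for the covariate is the log odds ratio per additional copy of the minor allele.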



Interaction Analysis 

[77] We conducted additional analyses to incorporate effects of the two most important 
known AMD risk factors, smoking and the CFH gene. First, we fit a series of logistic 
regression models to the combined case-control data set (including probands from the 
family-based dataset) to identify the model that best described (1) the joint effects of 
CFH and LOC387715, and (2) the joint effects of smoking and LOC387715. We 
followed a recently proposed modeling strategy (North et al. 2005) in which the 
best-fitting model was derived on the basis of Akaike's Information Criterion (AIC). The 
AIC compares different models with a log-likelihood ratio test that is penalized for the 
number of model parameters to identify the most parsimonious model that adequately 
fits the data. For each genotype, two model terms were tested: one coding for additive 
effects at the first, second, or both loci (ADD1, ADD2, ADDBOTH), using the coding 
described above, and the other one coding for dominance effects (DOM1, DOM2, 
DOMBOTH), with a value of -0.5 for genotypes 1/1 and 2/2, and a value of 0.5 for 
genotype 1/2. Three additional models (ADDINT, ADDDOM, DOMINT) were fit to 
test for deviation from joint additive or joint dominance effects of CFH and 
LOC387715, and two additional models (ADD_SMOKE_INT, DOM_SMOKE_INT) 
were fit for LOC387715 and smoking (comparing ever- vs. never-smokers). Models 
for which the AIC differed by less than 2 units were considered statistically 
indistinguishable (North et al. 2005), and the model with fewer parameters was 
chosen as the best fitting one. For example, when the addition of the ADDINT term 
did not provide a substantially better model fit, this was interpreted as lack of 
evidence for statistical interaction between the two factors. Thus, they each had 
independent main effects that were multiplicative (additive on the logarithmic scale) 
such that the best estimate of the odds ratio for being exposed to both factors was the 
product of the two main effect odds ratios. 
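
The model selection rule described above can be sketched as follows. The Python code computes AIC = 2k - 2 ln L for each candidate model, treats models within 2 AIC units of the lowest AIC as indistinguishable, and among those picks the one with the fewest parameters. Comparing every model against the lowest AIC is one reading of the rule and is an assumption, as are the log-likelihoods and parameter counts in the example; only the model names, the 2-unit threshold, and the dominance coding come from the text.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def select_model(models, tolerance=2.0):
    """Pick the most parsimonious model within `tolerance` AIC units of the best.

    models: iterable of (name, log_likelihood, n_params).
    Returns (name, aic_value, n_params) for the chosen model.
    """
    scored = [(name, aic(ll, k), k) for name, ll, k in models]
    best_aic = min(score for _, score, _ in scored)
    candidates = [m for m in scored if m[1] - best_aic < tolerance]
    # Fewest parameters wins; ties are broken by the lower AIC.
    return min(candidates, key=lambda m: (m[2], m[1]))


def dominance_code(genotype):
    """Dominance term: 0.5 for the heterozygote 1/2, -0.5 for 1/1 and 2/2."""
    first, second = genotype.split("/")
    return 0.5 if first != second else -0.5


# Hypothetical fits: ADDINT has the lowest AIC, but by less than 2 units.
hypothetical_models = [
    ("ADD1", -510.0, 2),
    ("ADDBOTH", -500.0, 3),
    ("ADDINT", -498.5, 4),
]
chosen = select_model(hypothetical_models)
print(chosen[0], chosen[1])
```

In this hypothetical example the interaction model improves the AIC by only 1 unit, so the simpler two-locus additive model is retained, which corresponds to the "no evidence for statistical interaction" interpretation given in the text.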

Our second approach for incorporating AMD-associated covariates was motivated by 
earlier reports of the 10q26 linkage evidence being due primarily to families with 
heavy smokers (Weeks et al. 2004). Similar to the previous study, we used an ordered 
subset analysis (OSA) (Hauser et al. 2004) with the family-average of smoking 
pack-years as a covariate. To avoid an undue influence of zero pack-years values on family 
averages, pack-years were coded as missing for non-smokers. Using the high-to-low 
ordering of family-averaged pack-years, OSA tested whether a subset of families with 
heavy smokers provided significantly greater linkage evidence than the reference 
dataset, which in this case was restricted to families for whom non-missing covariate 
values could be computed. Thus, the baseline lod score was computed for families in 
which there was at least one affected smoker with pack-years information. 
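
The covariate preparation for the ordered subset analysis can be illustrated with a short Python sketch. It computes the family-average pack-years among affected members, treats non-smokers (zero pack-years) and unknown values as missing, drops families with no usable value, and orders the rest from high to low. It does not implement OSA itself (the search for the lod-maximizing subset and the permutation test); the family identifiers, data, and function names are hypothetical.

```python
def family_average_pack_years(member_pack_years):
    """Average pack-years over affected members, with non-smokers coded as missing."""
    smokers = [v for v in member_pack_years if v is not None and v > 0]
    if not smokers:
        return None  # no affected smoker with pack-years information
    return sum(smokers) / len(smokers)


def order_families(families):
    """Return (family_id, average) pairs sorted from highest to lowest average."""
    averages = {fid: family_average_pack_years(v) for fid, v in families.items()}
    informative = [(fid, avg) for fid, avg in averages.items() if avg is not None]
    return sorted(informative, key=lambda item: item[1], reverse=True)


# Hypothetical pack-years for the affected members of three families.
hypothetical_families = {
    "F1": [30.0, 0.0, 50.0],  # one non-smoker, ignored in the average
    "F2": [60.0, None],       # one member with unknown smoking history
    "F3": [0.0, None],        # no affected smoker: excluded from the baseline
}
ordered = order_families(hypothetical_families)
print(ordered)
```

Excluding families such as F3 mirrors the restriction of the reference dataset to families with non-missing covariate values.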
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WE CLAIM: 

1. A method for assessing increased risk of Age Related Macular Degeneration,
comprising:

determining identity of at least one nucleotide residue of Complement Factor
H coding sequence of a person;

identifying the nucleotide residue as normal or variant by comparing it to a
normal sequence of Complement Factor H coding sequence as shown in SEQ
ID NO: 1, wherein a person with a variant sequence has a higher risk of Age
Related Macular Degeneration than a person with a normal sequence.

2. A method for assessing increased risk of Age Related Macular Degeneration, 
comprising: 

determining identity of at least one amino acid residue of Complement Factor 
H protein of a person; 

identifying the residue as normal or variant by comparing it to a normal 
sequence of Complement Factor H as shown in SEQ ID NO: 2, wherein a
person with a variant sequence has a higher risk of Age Related Macular 
Degeneration than a person with a normal sequence. 

3. The method of claim 1 wherein the at least one nucleotide is located in an exon
encoding a polyanion binding domain.

4. The method of claim 3 wherein the polyanion binding domain is selected from the
group consisting of SCR 7, 12-14, and 19-20.

5. The method of claim 3 wherein the polyanion binding domain is a heparin binding
domain selected from the group consisting of SCR 13, 19, and 20.

6. The method of claim 3 wherein the polyanion binding domain is in SCR 7.

7. The method of claim 1 wherein the at least one nucleotide is located in an exon
encoding a C-reactive protein binding domain.

8. The method of claim 6 wherein the C-reactive protein binding domain is in SCR 7.

9. The method of claim 1 wherein the at least one nucleotide is located in an exon 
encoding a C3b binding domain. 

10. The method of claim 8 wherein the C3b binding domain is in an SCR selected from
the group consisting of 1-4, 12-14, and 19-20.

11. The method of claim 1 wherein the nucleotide variant identified is at nt 1277 of SEQ
ID NO: 1.

12. The method of claim 2 wherein the amino acid variant identified is at residue 402 of
SEQ ID NO: 3.

13. The method of claim 1 wherein the nucleotide variant identified is T1277C of SEQ ID
NO: 1.

14. The method of claim 2 wherein the amino acid variant identified is Y402H of SEQ ID
NO: 3.

15. The method of claim 2 wherein the at least one amino acid residue is located in a
polyanion binding domain.

16. The method of claim 14 wherein the polyanion binding domain is selected from the
group consisting of SCR 7, 12-14, and 19-20.

17. The method of claim 14 wherein the polyanion binding domain is in SCR 7.

18. The method of claim 2 wherein the at least one amino acid residue is located in a
C-reactive protein binding domain.

19. The method of claim 17 wherein the C-reactive protein binding domain is in SCR 7.

20. The method of claim 2 wherein the at least one amino acid residue is located in a C3b
binding domain.

21. The method of claim 19 wherein the C3b binding domain is in an SCR selected from
the group consisting of 1-4, 12-14, and 19-20.

22. A method for screening for a potential drug for treating Age Related Macular 
Degeneration, comprising: 






contacting a Complement Factor H protein with a test agent in the presence of 
a polyanion; 

measuring polyanion binding to Complement Factor H; 

identifying a test agent as a potential drug for treating Age Related Macular
Degeneration if it increases binding of Complement Factor H to the polyanion.

23. The method of claim 22 wherein the polyanion is heparin.

24. The method of claim 22 wherein the polyanion is sialic acid.

25. A method for screening for a potential drug for treating Age Related Macular
Degeneration, comprising:

contacting a Complement Factor H protein with a test agent in the presence of
C-Reactive Protein;

measuring C-Reactive Protein binding to Complement Factor H;

identifying a test agent as a potential drug for treating Age Related Macular
Degeneration if it increases binding of Complement Factor H to C-Reactive
Protein.

26. The method of claim 1 wherein the nucleotide residue is determined by hybridization.

27. The method of claim 1 wherein the nucleotide residue is determined by primer 
extension. 

28. The method of claim 1 wherein the nucleotide residue is determined by nucleotide 
sequencing. 

29. The method of claim 1 wherein the nucleotide residue is determined by allele-specific
amplification.

30. The method of claim 2 wherein the amino acid residue is determined by means of an
antibody.

31. A method to assess risk of AMD in a patient comprising:

determining whether the patient has a T allele at rs10490924;
determining whether the patient is a cigarette smoker; and
identifying the patient as:

being at high risk of AMD if the patient has the T allele
and is a cigarette smoker,
being at lower risk of AMD if the patient has the T
allele but is not a cigarette smoker or is a cigarette
smoker but does not have the T allele, and
being at lowest risk if the patient does not have the T
allele and is not a cigarette smoker.

32. A method to assess risk of and treat AMD in a patient comprising:

determining whether the patient has a T allele at rs10490924;
determining whether the patient is a cigarette smoker; and
providing the patient with a behavioral therapy to encourage smoking
cessation if the patient has the T allele at rs10490924 and is a cigarette smoker.

33. A method to assess risk of and treat AMD in a patient comprising:

determining whether the patient has a T allele at rs10490924;
determining whether the patient is a cigarette smoker; and
providing the patient with smokeless nicotine to encourage smoking cessation
if the patient has the T allele and is a cigarette smoker.

34. The method of claim 32 wherein the step of providing comprises prescribing the 
behavioral therapy. 

35. The method of claim 32 wherein the behavioral therapy is counseling.

36. The method of claim 32 wherein the behavioral therapy is a class.

37. The method of claim 32 wherein the behavioral therapy is information.

38. The method of claim 32 wherein the information is printed matter.

39. The method of claim 32 wherein the information is on a data storage medium.

40. The method of claim 32 wherein the information is on an audio tape.

41. The method of claim 32 wherein the information is on a video tape.

42. The method of claim 33 wherein the smokeless nicotine is nicotine gum.

43. The method of claim 33 wherein the smokeless nicotine is in a transdermal patch.

44. The method of claim 33 wherein the smokeless nicotine is in a nasal spray.

45. The method of claim 33 wherein the smokeless nicotine is in an inhaler.

46. The method of claim 33 wherein the step of providing comprises prescribing or
recommending a form of smokeless nicotine.

47. The method of claim 31 further comprising determining if the patient has a variant of
Complement Factor H protein or coding sequence.

48. The method of claim 47 wherein a variant protein is determined. 

49. The method of claim 47 wherein a variant coding sequence is determined. 
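
Two relationships stated in the claims lend themselves to a short worked sketch: the correspondence between nucleotide 1277 of SEQ ID NO: 1 and residue 402 (claims 11 to 14), and the three-tier risk grouping of claim 31. The code below is an illustration under stated assumptions, not part of the claimed methods: the function names are hypothetical, and the coding-sequence start at nucleotide 74 is taken from the CDS feature given for SEQ ID NO: 1 in the sequence listing.

```python
CDS_START = 74  # first coding nucleotide of SEQ ID NO: 1, per its CDS feature

def residue_for_nucleotide(nt, cds_start=CDS_START):
    """Map a 1-based nucleotide index to (residue number, position in codon)."""
    offset = nt - cds_start
    if offset < 0:
        raise ValueError("nucleotide lies upstream of the coding sequence")
    return offset // 3 + 1, offset % 3 + 1

def amd_risk_category(has_t_allele, is_smoker):
    """Three-tier grouping following the wording of claim 31."""
    if has_t_allele and is_smoker:
        return "high"
    if has_t_allele or is_smoker:
        return "lower"
    return "lowest"

# nt 1277 falls on the first base of codon 402, consistent with T1277C
# changing a Tyr codon (TAT) into a His codon (CAT), i.e. Y402H.
residue, codon_position = residue_for_nucleotide(1277)
```

The arithmetic is (1277 - 74) = 1203 = 3 x 401, so nucleotide 1277 opens the 402nd codon, which is why the nucleotide and amino acid variants in claims 13 and 14 describe the same change.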
49.19  [illegible]  0.8188
[illegible]  [illegible]  11.48  35.51  0.2795
[illegible]  [illegible]  33.89  47.58  0.5345
[illegible]  [illegible]  38.71  13.27  0.5677
[illegible]  [illegible]  24.46  14.92  0.4149
[illegible]  [illegible]  45.28  36.80  0.7321
[illegible]  [illegible]  43.52  14.17  0.4423
[illegible]  [illegible]  32.39  14.63  0.9485
[illegible]  [illegible]  30.22  13.36  0.7472
rs1914525  [illegible]  15.72  25.80  [illegible]
[illegible]  [illegible]  22.51  14.57  0.1366
[illegible]  [illegible]  13.32  25.71  0.9231
[illegible]  [illegible]  46.34  25.71  0.7689
[illegible]  [illegible]  23.73  26.41  0.5292
[illegible]  [illegible]  32.24  12.92  [illegible]
[illegible]  [illegible]  27.98  36.84  0.1094
[illegible]  [illegible]  35.93  45.45  0.8753
[illegible]  [illegible]  32.11  45.47  0.4991
[illegible]  [illegible]  44.13  24.04  0.8571
[illegible]  [illegible]  32.93  34.82  0.0815
[illegible]  [illegible]  34.77  42.33  [illegible]
[illegible]  [illegible]  35.40  29.17  0.6351
[illegible]  [illegible]  [illegible]  [illegible]  0.4645
[illegible]  [illegible]  [illegible]  [illegible]  [illegible]
[illegible]  [illegible]  [illegible]  21.77  0.1658
[illegible]  [illegible]  34.70  20.47  0.8067
[illegible]  [illegible]  33.27  48.98  0.4949
[illegible]  [illegible]  30.40  48.71  0.5699
[illegible]  [illegible]  34.93  48.86  0.1605
[illegible]  [illegible]  46.97  4.20  0.0644


rs2303611  128.597979  45.66  17.80  0.0884
rs7323382  126.556403  25.41  3.17  0.5893
rs10901841  126.622376  24.10  2.02  0.8183
rs718949  126.729019  23.31  1.91  0.4SS8
rs1715S73  126.815593  18.81  2.19  0.4638
rs2886276  126.904176  37.92  [illegible]  0.7160
rs2017040  127.071823  31.65  [illegible]  0.9394



FIG. 15D 






rs768761  127.124751  30.02  26.92  0.2250
rs2304264  127.254905  8.59  8.78  0.8927
rs§422913  127.335597  11.47  10.60  0.8795
[illegible]  [illegible]  38.03  34.48  0.8909
rs2281955  127.474807  43.30  47.24  0.2344
rs7095359  127.951592  20.66  21.18  0.5940
rs2768070  128.978387  47.17  44.51  0.3878
rs1001990  130.298844  28.52  23.84  0.0510
rs7091540  131.310286  59.25  44.82  0.0983
rs91382S  131.945479  25.43  23.81  0.6598



FIG. 15E 






<110> Pericak-Vance, Margaret
Haines, Jonathan
Postel, Eric
Agarwal, Anita
Hauser, Michael
Schmidt, Silke
Scott, William K.

<120> Genetic Variants Increase the Risk of
Age-Related Macular Degeneration

<130> 000250.00041

<150> 60/650,208
<151> 2005-03-04

<160> 56

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 3926
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> (74)...(3767)

<220>
<221> variation
<222> (1277)...(1277)
<223> polymorphic variation

<400> 1

aafetcttgga agaggagaac tggaogttgt gaa 
aaagafcccaa aaaatgagac ttetagcaaa gat 



tfcfcttggagt aaagagaaac caaagtgtgt ggaaatttca 
aaatggatct cctstatctc agaagattat fctataaggag 
atgtsacatg ggttatgaat acagtgaaag aggagatgct 



agctggtaaa 


tgtcctctta 


60 


cttatgttat 


gggctatttg 


120 


acagaaattc 


tgacaggttc 


180 


tataaatgco 


gccctggata 


240 


gaatgggttg 


ctcttaatcc 


300 


gatactoctt 


ttggtacttfc 


360 


gctgtgtata 


catgtaafcga 


420 


gacacagatg 


gatggaccaa 


48 0 


acagcaccag 


agaatggaaa 


540 


tttggacaag 


cagtacggtt 


600 


atgcattgtt 


cagacgatgg 


€60 


tgcaaatccc 


cagatgttat 


720 


aatgaacgat 


ttcaatataa 


780 


gtatgcactg 


aatctggatg 


840 


ecttatatte 


caaatggtga 


900 


atcacgtacc 


agtgtagaaa 


960 



tgcfeccgaga 
tcatgagaak 
ctgtgatgaa 
agatggatgg 
tggatataat 
ecatcctggc 
gtctcctact 
gaatgggttt 
atgcaaaefca 
agatggatgg 
tgccagaact 
ccafcgatggt 
tggttggtct 
acacttagtt 
ctgcaa&cca 
gtctcctgac 
cctcaatggg 
atattattgc 
agagtggaca 
acttgaacat 
attcaatfcgc 
agtatggacc 
aaatttaatt 
cataaggtac 
atgggatcca 
gattcccaat 
tgttctttgc 
aagatggcag 
agaaeacgga 
attgagttat 
catgggaaaa 
gatttctcat 
gtacaaatgt 
aaaatggtct 
aaatgccata 
caettgcgca 
afcggaaagga 
tgcttatata 
atgtaggagc 
gacggaacca 
caatggggac 
ccaatgccag 
atggtcagaa 
ttataacata 
agfctgaatfct 
aacafcgttgg 
aagtgcacac 
fcattgtttta 
tataagctga 



atgogfcagac 
cattttgaga 
tcgccagcag 
caaaafcfcafcg 
tac.gctcttc 
cccagatgca 
atttctgaat 
ggacatgcaa 
tcagctcaac 
aaaaacgact 
tatgaaagca 
gatttacccs 
cctgatcgca 
ggatttacaa 
ctcccaatat 
aatgttaagg 
aatcctagat 
actttaccag 
ggctgggccc 
tcagaatcat 
caacfctcccc 
atacttgagg 
agstgtagag 
gaagtgaact 
tctcacaata 
caagaaaatt 
tcaataccac 
accattaafcb 
actfcgtgagg 
tggagttetc 
ggtgttgfcag 
fcttgaaggtt 
caccctccat 
cccatgggag 
acatattac:a 
aggccaacat 
gtgtcgagac 
ccttatgaaa 
ecteaatgca 
attactfccat 
aacttgtatc 
ecacoaaaat 
gcattaaggt 
gtgtgtaaac 
gatgggaaac 
otttattcag 
ctccttttta 
gaccggtggc 



taatac 

cat&etttcc 
ctccgtcagg 
taccatgccfc. 
gaagaaagtfc. 
caaaagcgaa 
tccgtgtcaa 
ctcagtafcac 
cag«agafcgg 

tcacatggtt 
atactggaag 
tatgttatga 
agaaagacca 
tagttggacc 
gtaaagagca 
aaaaaacgaa 
ttefcaatgaa 
tgtgtattgt 
agctttcttc 
ttacaafcgat 
agtgtgtggc 



:sttt 



gaaaagaagg 
gatcaatggc 
tgacaaccac 
atctaattca 
tctgtgfctga 
catccaggfcc 
gtggtttcag 
cacctcagtg 
otcacatgtc 
ttggaattga 
catgcataaa 
agaagaagga 
aaatggatgg 
gcagagacac 
agatgagtaa 
tgtttgggga 
aagattctac 
tcccgttgtc 
aacttgaggg 
gcttaeatcc 
ggacagccaa 
ggggatatcg 
fcggagtatcc 
aactfctagta 
ttcatacgta 
tctctt 



fctatccagac 
agtagctgta 
aagttactgg 
cagaaaatgt 
tgtacagggt 

aacatgttcc 
afcatgcctta 

taaatettgt 
taagctgaat 
caccactggt 
aagagaatgc 
gtataaagtt 
taattccgtt 
agtacaatca 
agaagaatat 
gggacctaat 
ggaggagagt 
ccctccttat 
tggacacaga 
aatagataaa 
aaacaagaag 
atggatacac 
acaaatacaa 
actgaatfcat 
ggaaggagaa 
aaaaattcca 
ttcacaagaa 
gatatctgaa 
tgaaggcctt 
agacagttat 
tgggcctgca 
aacagattgt 
tgtgtataag 
agccagtaat 
ctcctgtgtg 
atatccatcfc 
tgaagaagtg 
aggaaaatgt. 
agtatatgct 
taacaagcga 
gtgtgtaata 
acagaagctt 
tctfctcafcca 
aacfctgtgca 
ttaaatcagfc 
aaattttgga 



acaagtaotg 
attaaacatg 
ggaaaatatt 
gatcacatte 
tattttcctt 
aaatctatag 
aeatgtatgg 
aaatcaagta 
aaagaaaaag 
ggatcaatta 
gatatcccag 
gacacatfcgg 
tccatagtgt 
gaacttccfca 
ggagaggtgt 
eagtgctacc 
tgtggtncac 
ggacacagtg 
aaaattcaat 
acctgtggag 
tactatggag 
tcaattacgt 
efctaagaagfc 
gaattcgatc 
acagtetgca 
ttatgcccac 
cgggatggag 
gaaa-ctacat 
tgttcacaac 
agttatgcac 
gaaaatgaaa 
ccttgtaaat 
cagtatggag 
attgcaaaat 
ctcagtttac 
gcgggfcgagc 
gtaacatgca 
aatccgccca 
ggtgagagag 
atgtgtttaa 
gggcccccte 
ccagcttcat 
ataacatgta 
ccccgagaaa 
tattcgagaa 
agttctcaca 
aaaagataga 
tctcaatttc 
fctaatttgtg 



gctggata 
gaggtctata 
actcctatta 
attgcacaca 
atttggaaaa 
acgttgcctg 
agaatggctg 
tagatattga 
cgaaatatca 
gatgtgggaa 
tatttatgaa 
actatgaatg 
gtggttacaa 
aaatagatgt 
tgaaattctc 
acttcggatt 
ctcctgaaafc. 
aagtggtgga 
gtgttgatgg 
atacacctga 
attcagtgga 
gtattcatgg 
gcaaatcatc 
at&attctaa 
taaatggaag 
ctccacctca 
aaaaagtatc 
gnaaagatgg 
cacctcagat 
atgggectas 
caacatgcta 
ctccac:ctga 
aagaagttac 
gcttaggaga 
ctagctttga 
aagtgactfca 
ttaacagcag 
cagtacaaaa 
tacgttatca 
atggaaactg 
bacctattga 
oagttgagta 
gaaatggaaa 
ttatggaaaa 
caggtgaatc 
cattgcgaac 
atcaatcata 
attttttatg 
aaaatgtaat 






1G80 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

IS 60 

1620 

1680 

1740 

1800 

1860 

1920 

1930 

2040 

2.100 

2160 

2220 

22S0 

2340 

2400 

2460 

2 520 

2580 

2640 

2700 

276C 

2820 

2880 

2940 

3000 

3060 

3120 

31S0 

3240 

3300 

3360 

3420 

34S0 

3540 

3600 

3660 

3720 

3780 

3840 

3S00 

3926 



<210> 2
<211> 3926
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> (74)...(3767)

<220>
<221> variation
<222> (1277)...(1277)
<223> polymorphic variant

<400> 2

aattcttgga agaggagaac tgqacgir.tgt gaacagagnt; agctggtaaa tgtcctcfcta 
aaagstccaa aaaatgagac ttetagcaaa gattatttgc cttatgttat gggctatttg 
tgtagcagaa gattgcaatg aactfcocfccc aagaagaaat acagaaattc tgacaggttc 
acaggctatc tataaafcgec gccctggata 
jtctfc ggaaatgtaa taatggtatg caggaaggga gaatgggttg ctnfctaatcc 
attaaggaaa agtcagasaa ggccctgtgg acatcetgga gatactcctt ttggtactct 
taccctta > t ] t jt t fcgaata tggfcgtaaaa gctgtgtata catgtaatga 

>u gtg a ^-.taarra cegtgaatgt gacacagatg gafcggaccaa 
tgat-attcca atatgtgaag ttgtgaagtg tttaccagtg acagcaccag agaatggaaa 
aatagtaaga agfcgcaafcgg aaccagatcg ggaata-aeat tttggacaag cagtacggtt 
tgtatgtaac fccaggctaca agattgaagg agatgaagaa atgcattgtt cagacgatgg 
tttttggagt: aaagagaaac eaaagtgfcgfc ggaaatttca tgcaaatcce cagatgttat 
aaatggatct cctatatcto agaagafcfcat ttataaggag aatgaacgat ttcaatataa 
acgtaacatg ggttatgaat acagtgaaag aggagatgct gtatgcactg aatot.ggafcg 
gcgtccgttg eettcatgtg aagaaaaatc atgtgataat ccttatatto caaatggtga 
ctactcacct ttaaggatta aacacagaac tggagatgaa afccacgtacc agfcgtagaaa 
tggtttttat cctgcaaccc ggggaaatac agccaaatgc acaagfcactg gctggatacc 
tgctccgaga tgtaccfctga aaccttgtga ttafcccagsc attaaacatg gaggtctata 
fccatgagaah atgcgtagao catactttcc agtagctgta ggaaaatatt actcctatt* 
ctgtgatgaa cattttgaga ctccgtcagg aagttacfcgg gatcacattc attgcacaca 
tc i jgatgg tcgccagcag taccatgcct cagaaaatgt tattttcctt atttggaaaa 
tggatataat caaaatcatg gaagaaagtt fcgtacagggt aaatctatag acgttgcctg 
ccatcctggc tacgctcttc caaaagcgea gaccacagtt acatgtatgg agaatggctg 
gtctcctact cccagatgca tccgtgtcaa aaoatgttcc aaatcaagta tagatattga 
gaatgggttk afcttctgaat ctcagtatac atatgactta aaagaaaaag cgaaatatca 
atgcaaacfca ggatatgfcaa cagcagatgg tgaaacatca ggatcaatta gatgtgggaa 
agatggafcgg tcagctcaac ccacgtgcat taaatcttgt gatatcccag tatttatgaa 
tgc:cagaact aaaaatgaat fccacatggtt taagctgaat gacaeattgg actatgaafcg 
• itgaaagca atactggaag caocacfcggt tccatagtgt gtggttacaa 
tggttggtct gatttaccca tatgtfcafcga aagagaatgc gaacttccta aaatagatgt 
aoacttagtt cctgafccgca agaaagacc* gtataaagtt ggagaggtgt tgaaattctc 
cfcgcaaacca ggatttacaa tagttggacc taattccgtt eagtgctacc actttggatt 
gtctccfcgac ctcccaafcat gfcaaagagca agtacsatca tgfcggtccac ctcotgaact 
ccteaatggg aatgttaagg aaaaaacgaa agaagaatat qgacacagtg aagtgg* , ; 
atattattgc aatcctagat ttctaatgaa gggacctaat aaaattcaat gtgfctgatgg 
g 3 i ggaca actttaccag tgtgtattgt ggaggagagt acctgtggag atafcacctga 
acttgaacat ggctgggccc agctttcttc ccctccttafc tactatggag attcagfcgga 
at-tcaatt-gc tcagaatcat tcacaatgat bggacacaga tcaafctacgt gtattcatgg 
agtatggacc caacttcccc agtgtgtggc aatagataaa cttaagaagt gcaaatcatc 
aaat.tt.aatb afcacttgagg aacatttaaa aaacaagaag gaattagafcc ataattctaa 
cataaggtaa agaCgtagag gaaaagaagg atggatacac acagtetgca taaatggaag 
gctcaatggc acaaatacaa tfcatgcccac ctccacctaa 
gactcccaat tcfccacaata fcgacaaccac actgaattafc cgggatggag aaaaagtatc 
tgttctttgc caagaaaatt atctaattca ggaaggagaa gaaattacat gcaaagatgg 



60 
120 
180 
240 
300 
3S0 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
3.140 
1200 

12 SO 
1320 

13 8 0 
1440 
iSOO 

lseo 

1S20 
1680 
1740 
1800 
I860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
23-10 
2400 
2460 
2520 
2580 
2640 



i?* - ; _ < ' .u. SCt 
attgagttat acttgtgagg 
catgggaaaa tggagttete 
gatttctcst ggtgttgtag 
gtacaastgt tttgaaggtt 
aaaatggtct eacoctecat 
aaatgecata cccafcgggag 
eacttgtgca acatatfcaca 
atggacagga aggecaacat 
tgcttatata gtgtcgagsc 
atgtaggage ccttatgaaa 
3 iacca cetcaatgea 
aatggg ic attact teat 
ccaatgccag aacttgtatc 
afcggtcagaa ccaceaaaat 
ttataacata gcautaaggt 
agttgaatfcfc gtgtgtaaac 
aacatgttgg gatgggaaac 
aagtgcacac ctttattcag 
tattgtttta ctccttttta 
tataagetga gaccggtggc 



tctgtgttga 
'Sfecaggfcc 
gtggtttc&g 
cacctcagtg 
ctcacatgtc 
ttggaattga 
eatgeataaa 
agaagaagga 
aaatggafcgg 
gcagagacac 
agafcgagfcaa 
tgtttgggga 
aagattctac 
tcccgttgtc 
aaofcfcgaggg 
gcttacatcc 
ggacagccaa 
ggggatatcg 
tggagtatcc 
aaetttagca 
tteataagta 
fcetett 



tgaaggectt 
agacagttat 
fcgggcctgca 
aacagattgt 
tgtgtataag 
agecagtaat 
ctcetgtgtg 
atafcecatet 
fcgaagaagtg 
aggaaaatgt 
agfcatatget 
taacaagega 
gfcgtgtaata 
acagaagctt 
tctttcatca 
aacttgtgca 
ttaaatcagt 
aaattfctgga 



ccttgtaaat 
cagfcatggag 
attgeaaaat 
ctcagtttac 
gcgggtgagc 
gtaacatgea 
aatccgccca 



atgtgtttaa 
gggccccctc 
acagcttcat 
ataacatgta 
tcccgagaaa 
tattcgagaa 
cgttctcaca 
aaaagataga 
fcetcaattte 
ttaatttgtg 



cacGtcagPCT/D! 
atgggactaa 
caacatgeta 
ctccacctga 
aagaagttac 
gcttaggaga 
ctagctttga 
aagfcgacfcta 
ttaatagcag 
cagtac&aaa 
tacgttatca 
atggaaactg 
cacetattga 
cagttgagta 
gaaatggaca 
fctatggaaaa 
caggtgaatc 
cattgegaac 
atcaatcata 
attttttatg 
aaaatgtaat 




2760 
2 820 
2330 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 



3480 
3 540 
3500 
3860 
3720 
3780 
3840 
3900 
3926 



<210> 3
<211> 1231
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> (1)...(18)

<220>
<221> VARIANT
<222> (402)...(402)
<223> polymorphic residue

<400> 3

Met Arg Leu Leu Ala Lvs He He Cys Leu Met Leu Trp Ala lis Cys 

1 5 10 15" 

Val Ala Glu Asp Gys Asa Glu Leu Pro Pro Arg Arg Asn Thr Glu He 

20 25 " " 30 

Leu Thr Gly Ser Trp Ser Asp Gin Thr Tyr Pro Glu Gly Thr Gin Ala 

33 40 45 

He Tyr Lye Cys Arg Pro Gly Tyr Arg Ser Leu Gly Asn Val He Met 

50 ^ 55 60 

Val Cys Arc hym Gly Glu Trp Val Ala Leu Asa Pro Leu Arg Lys Cys 
65 70 75 80 

Gin Lys Arg Pro Cys Gly His Pro Gly Asp Thr Fro Phe Gly Thr Phe 

SB 90 SB 

Thr Leu Thr Gly Gly Asn Val Phe Glu Tyr Gly Val Lys Ala Val Tyr 

100 105 HO 

Thr Cys Asn Glu Oly Tyr Gin. Leu Leu Gly Glu lie Asn Tyr Arg Glu 

115 120 125 

Cys Asp Thr Asp Gly Trp Thr Asn Asp lie Pro Ho Cys Glu Val Val 



130 135 140

Lys Cys Leu Pro Val Thr Ala Pro Glu Asn Gly Lys Ile Val Ser Ser
145 ' ISO 155 160 

Ala Met Glu Pro Asp Arg Glu Tyr His Phe Gly Gin Ala Val Arg Phe 

165 170 175 

Val Cys Asn Ser Gly Tyr Lys He Glu Gly Asp Glu Glu Met His Cys 

180 185 190 

Ser Asp Asp Gly Pbe Trp Ser Lys Glu Lys Pro Lys Cys Val Glu lie 

195 200 205 

Ser Cys Lys Ser Pro Asp Val lie Asn Gly Ser Pro lie Ser Gin Lys 

210 215 220 

He He Tyr Lys Glu Asn Glu Arg Phe Gin Tyr Lys Cys Asn Met Gly 
225 23 0 235 240 

Tyr Glu Tyr Ser Glu Arg Gly Asp Ala Val Cys Thr Glu Ser Gly Trp 

245 250 255 

Arg Pro Leu Pro Ser Cys Glu Glu Lys Ser Cys Asp Asn Pro Tyr He 

260 265 270 

Pro Asn Gly Asp Tyr Ser 5?rc Lea Arg He. Lys His; Arg Thr Sly Asp 

275 280 285 

Glu He Thr Tyr Gin Cys Arg Asn Gly Phe Tyr Pro Ala Thr Arg Gly 

290 295 300 

Asn Thr Ala Lys Cys Thr Ser Thr Gly Trp He Pro Ala Pro Arg Cys 
305 310 315 320 

Thr Leu Lys Pro Cys Asp Tyr Pro Asp He Lys Kis Gly Gly Leu Tyr 

325 330 335 

His Giu Asn Met Arg Arg Pro Tyr Phe Pro Val Ala Val Gly Lys Tyr 

340 345 350 

Tyr Ser Tyr Tyr Cys Asp Glu His Phe Glu Thr Pro Ser Gly Ser Tyr 

355 360 , 365 

Trp Asp His IX© His Cys Thr Gin Asp Gly Trp Ser Pro Ala Val Pro 

370 375 380 

Cys Leu Arg Lys Cys Tyr Phe Pro Tyr Leu Glu Asn Gly Tyr Asn Gin 
335 ' 390 395 400 

Asn Tyr Gly Arg Lys Phe Val Gin Gly Lys Ser He Asp Val Ala Cys 

405 410 415 

His Pro Gly Tyr Ala Leu Pro Lys Ala Gin Thr Thr Val Thr Cys Met 

420 425 430 

Glu Asn Gly Trp Ser Pro Thr Pro Arg Cys He Arg Val Lys Thr Cys 

435 440 445 

Ser Lys Ser Ser He Asp He Glu Asn Gly Phe Ha Ser Glu Ser Gin 

450 * 455 460 

Tyr Thr Tyr Ala Leu Lys Glu Lys Ala Lys Tyr Gin Cys Lys Leu Gly 
465 470 475 ' 480 

Tyr Val Thr Ala Asp Gly Glu Thr Ser Gly Ser lie Arg Cys Gly Lys 

485 490 495 

Asp Gly Trp Ser Ala Gin Pro Thr Cys He Lys Ser Cys Asp He Pro 

500 505 510 

Val Phe Mat Asn Ala Arg Thr Lys Asn Asp Phe Thr Trp Phe Lys Leu 

515 520 "" ' 525 

Asn Asp Thr Leu Asp Tyr Glu Cys His Asp Gly Tyr Glu Ser Asn Thr 

S30 535 540 . 

Gly Ser Thr Thr Gly Ser lis Val Cys Gly Tyr Asn Gly Trp Ser Asp 
545 550 555 560 

Leu Pro lie Cys Tvr Glu Arg Glu Cys Giu Leu Pro Lys He Asp Val 
565 S70 575 



His Leu Val Pro Asp Arg Lys Lys Asp Gln Tyr Lys Val Gly Glu Val

580 585 590

Leu Lvs Phe Ser Cys Lys Fro Gly ?he Tfar He Val Gly Pro Asa Ser 

535 600 SOS 

Val Gin ctvp Tvr His She Gly Leu Ser Pro Asp Lea Pro lie Cys Lys 

610 ' " 615 620 

Gl\i Gut Val Gin Ser Cys Gly Pro Pro Pro Glu Leu Leu As is. Gly Asn 
625 630 635 640 

Val Lys Ola Lvs Thr Lys Glu Glu Tyr Gly His Ser Glu Val Val Glu 

645 650 655 

Tvr Tyr Cys Asn Pro Arg Phe Leu Met Lys Gly Fro Asn Lys He Gin 

SS0 S6S S70 

Cys Val Asp Gly Glu Trp Thr Thr Leu Pro Val Cys lie Val Glu Glu 

675 600 685 

Ser Thr Cys Gly Asp lie Pro Glu Leu Glu His Gly Trp Ala Gin Leu 

690 695 790 

Ser ser Pro Pro Tyr Tyr Tyr Gly Asp Ser Val Glu Phe Asn Cys Ser 
705 710 715 

Glu Ser Phe Thr Met lie Gly His Arg Ser He Thr Cys He His Gly 

725 730 735 

Val Trp Thr Gin Leu Pro Gin Cys Val Ala He Asp Lys Leu Lys Lys 

740 745 750 

Cys Lys Ser Ser Asn Leu He He Leu Glu Glu His Leu Lys Asn Lys 

753 760 765 

Lys Glu Phe Asp His Asn Ser Asn He Arg Tyr Arg Cys Arg Gly Lys 

770 775 780 

Glu Gly Trp He His Thr Vai Cys He Asn Gly Arg Trp Asp Pro Glu 
785 790 795 800 

Val Asn Cys Ser Met Ala Gin He Gin Leu Cys Pro Pro Pro Pro Gin 

§05 810 SIS 

He Pro Asn Ser His Asn Met Thr Thr Thr Leu Asn Tyr Arg Asp Gly 

820 825 830 

Glu Lys Val Ser Val Leu Cys Gin Glu Asn Tyr Leu He Gin Glu Gly 

835 84 0 845 

Glu Glu He Thr Cys Lys Asp Gly Arg Trp Gin Ser He Pro Leu Cys 

850 855 860 

Val Glu Lys He Pro Cys Ser Gin. Pro Pro Gin He Glu His Gly Thr 
865 870 S75 880 

He Asn Ser Ser Arg Ser Ser Gin Glu Sar Tyr Ala His Gly Thr Lys 

885 890 895 

Leu Ser Tyr Thr Cys Glu Gly Gly Phe Arg He Ser Glu Glu Asn Glu 

900 305 ' 910 

Thr Thr Cys Tyr Met Gly Lys Trp Ser Ser Pro Pro Gin Cys Glu Gly 

915 920 925 

Leu Pro Cys Lys Ser Pro Pro Glu He Ser His Gly Val Val Ala His 

930 ' 935 " 940 

Met Ser Asp Ser Tyr Gin Tyr Gly Glu Glu Val Thr Tyr Lys Cys Phe 
945 950 955 960 

Glu Glv Phe Gly He Asp Gly Pro Ala He Ala Lys Cys Leu Gly Glu 

965 970 975 

Lys Trp Ser His Pro Pro Ser Cys He Lys Thr Asp Cys Leu Ser Leu 

980 985 S90 

Pro Ser Phe Glu Asn Ala He Pro Met Gly Glu Lys Lys Asp Val Tyr 

995 1000 1005 

Lys Ala Gly Glu Gin Val Thr Tyr Thr Cys Ala Thr Tyr Tyr Lys Met 



1010 1015 1020

Asp Gly Ala Ser Asn Val Thr Cys Ile Asn Ser Arg Trp Thr Gly Arg

1025 1030 1035 1040 

Pro Thr Cvs Arg Asp Thr Ser Cys Val Asn Pro Pro Thr Val Gin Asn 

1045 1050 10S5 

Ala Tyr He Val Ser Arg Gin Met. ser Lys Tyr Pro Ser Gly Glu Arg 

1060 10S5 1070 

Val Arg Tyr Gin Cys Arg Ser Pro Tyr Giu Met Phe Gly Asp Glu GlU 

1075 1080 108S 

Val Met Cys leu Asn Gly Asn Trp Thr Glu Pro Pro Gin Cys Lys Asp 

1090 1095 1100 

Ser Thr Gly Lys Cys Glv Pro Pro Pro Pro He Asp Asa Gly Asp He 
1105 " 1110 HIS 1120 

Tor Ser Phe Pro Leu Ser Val Tyr Ala Pro Ala Ser Ser Val Gin Tyr 

1125 II 35 
Gin Cys Gin Asn Leu Tyr Gin Leu Glu Gly Asn Lys Arg He Thr Cys 

1140 1145 HS0 

Acq Ask Glv Gin Trp Ser Glu Pre Pre Lys Cys Leu His Pro Cys Val 

1155 1160 use 

He Ser Arc? Glu He Met Glu Asn Tyr Asn He Ala Leu Arg Trp Thr 

1170 ~ H75 H30 

Ala LVS Gin Lvs Leu Tyr Ser Arg Thr Gly Glu Ser Val Glu Phe Val 
1185 ' 1190 H93 1200 

Cys Lys Arg Gly Tyr Arg Leu Ser Ser Arg Ser His Thr Leu Arg Thr 

1205 1210 1215 

Thr Cys Trp Asp Gly Lys Leu Glu Tyr Pro Thr Cys Ala Lys Arg 
1220 1225 ' 123 0 



<210> 4
<211> 1231
<212> PRT
<213> Homo sapiens

<220>
<221> SIGNAL
<222> (1)...(18)

<220>
<221> VARIANT
<222> (402)...(402)
<223> polymorphic residue

<400> 4

Met Arg Leu Leu Ala Lys He He Cys Leu Met Leu Trp Ala lie Cys 

1 5 10 15 

Val Ala Glu Asp Cys Asn Glu Leu Pro Pro Arg Arg Asn Thr Glu lie 

30 25 30 

Leu Thr Gly Sex Trp Ser Asp Gin Thr Tyr Pro Glu Gly Thr Gin Ala 

35 40 45 

He Tyr Lys Cys Arg Pro Gly Tyr Arg Ser Leu Gly Asn Val He Met 

SO ■ 55 60 

Val Cys Arg Lvs Gly Glu Trp Val Ala Leu Asn Pro Leu Arg Lys Cys 
65 ' 70 75 80 

Gin Lys Arg Pro Cys Gly His Pro Gly Asp Thr Pro Phe Gly Thr Phe 



85 90 95

Thr Leu Thr Gly Gly Asn Val Phe Glu Tyr Gly Val Lys Ala Val Tyr

100 105 110 

Thr Cys Asn Glu Gly Tyr Gin Leu Leu Gly Glu lie Asn Tyr Arg Glu 

115 120 125 

Cys Asp Thr Asp Gly Trp Tax Asn Asp He Pro lie Cys Glu Val Val 

130 135 140 

Lvs Cys Leu Pro Val Thr Ala Pro Glu Asn Gly Lys He Val Ser Ser 
145 150 155 160 

?la Met Glu Pro Asp Ara Glu Tyr His Phe Gly Gin Ala Val Arg Phe 

165 170 175 

Val Cys Asn Ser Gly Tyr Lys lie Glu Gly Asp Glu Glu Met. His Cys 

180 185 190 

Ser Asp Asp Gly Phe Trp Ser Lys Glu Lys Pro Lys Cys Val Glu He 

195 200 205 

Ser Cys Lys Ser Pro Asp Val lie Asn Gly Ser Pro He Ser Gin Lys 

210 215 220 

He He Tvr Lys Glu Asn Glu Arg Phe Gin Tyr Lys Cys Asn Met Gly 
225 23 0 235 240 

Tyr Glu Tyr Ser Glu Arg Gly Asp Ala Val Cys Thr Glu Ser Gly Trp 

245 250 255 

Arg Pro Leu Pro Ser Cys Glu Glu Lys Ser Cys Asp Asn Pro Tyr He 

260 265 270 

Pro Asn Gly Asp Tvr Ser Pro Leu Arg He Lys His Arg Thr Gly Asp 

275 280 285 

Glu He Thr Tyr Gin Cys Arg Asn Gly Pha Tyr Pro Ala Thr Arg Gly 

290 295 300 

Asn Thr Ala Lys Cvs Thr Ser Thr Gly Trp He Pro Ala Pro Arg Cys 
305 310 315 ' 320 

Thr Lau Lys Pro Cys Asp Tyr Pro Asp He Lys His Gly Gly Leu Tyr 

32B 330 335 

Hie Glu Asn Met Arg Arg Pro Tyr Phe Pro Val Ala Val Gly Lys Tyr 

340 345 350 

Tyr Ser Tyr Tyr Cys Asp Glu His Phe Glu Thr Pro Ser Gly Ser Tyr 

355 3«0 365 

Trp Asp His He His Cys Thr Gin Asp Gly Trp Ser Pro Ala Val Pro 

370 375 380 

Cys Leu Arg Lys Cys Tyr Phe Pro Tyr Leu Glu Asn Gly Tyr Asn Gin 
335 ' 390 395 400 

Asn His Gly Arg Lys Phe Val Gin Gly Lys Ser He Asp Val Ala Cys 

405 410 415 

His Pro Gly Tyr Ala Leu Pro Lys Ala Gin Thr Thr Val Thr Cys Met 

420 425 430 

Glu Asn Gly Trp Ser Pro Thr Pro Arg Cys He Arg Val Lys Thr Cys 

435 440 445 

Ser Lys Ser Ser He Asp He Glu Asn Gly Phe He Ser Glu Ser Gin 

450 455 460 

Tyr Thr Tyr Ala Leu Lys Glu Lys Ala Lys Tyr Gin Cys Lya Leu Gly 
465 " 470 475 480 

Tyr Val Thr Ala Asp Gly Glu Thr Ser Gly Ser lie Arg Cys Gly Lys 

485 490 ' 495 

Asp Gly Trp Ser Ala Gin Pro Thr Cys He Lys Ser Cys Asp lie Pro 

500 505 510 

Val Phe Met Asn Ala Arg Thr Lys Asn Asp Phe Thr Trp Phe Lys Leu 
515 520 525 

Asn Asp Thr Leu Asp Tyr Glu Cys His Asp Gly Tyr Glu Ser Asn Thr

530 535 540

Gly Ser Thr Thr Gly Ser Ile Val Cys Gly Tyr Asn Gly Trp Ser Asp
545 S50 S55 560 

Leu Pro He Cys Tyr Glu Arg Glu Cys Glu Leu Fro Lys lie Asp Val 

565 570 S7S 

His Leu Val Pro Asp Arg Lys Lys Asp Gin Tyr Lys Val Gly Glu Val 

580 58S 590 

Leu Lvs Phe Ser Cys Lys Fro Gly Phe Thr He Val Gly Pro Asn Ser 

595 ' 600 60S 

Val Gin Cvs Tyr His Phe Gly Leu Ser Pro Asp Leu Pro He Cys Lys 

6X0 * SIS «20 

Glu Gin Val Gin Ser Cys Gly Pro Pro Pro Glu Leu Leu Asn Gly Asn 
62S 630 S35 640 

val Lys Glu Lys Thr Lys Glu Glu Tyr Gly Hie Ser Glu Val Val Glu 

545 650 SSS 

Tyr Tyr Cys Asn Pro Arg She Leu Met Lys Gly Pro Asn Lys He Gin 

660 6S5 670 

Cys Val Asp Gly Glu Trp Thr Thr Leu Pro Val Cys He Val Glu Glu 

675 680 68S 

Ser Thr Cys Gly asp lie Pro Glu Leu Glu His Gly Trp Ala Gin Leu 

S90 695 700 

Ser Ser Pro Pre Tyr Tyr Tyr Gly A3p Ser Val Glu Phe Asn Cys Ser 
705 710 715 720 

Glu Ser Phe Thr Met He Gly His Arg Ser He Thr Cys He His Gly 

72S 730 735 

Val Trp Thr sin Leu Pro Gin Cys Val Ala He Asp Lys Leu Lys Lys 

740 745 ?50 

Cvs Lys Ser Ser Asn Leu He He Leu Glu Glu His Leu Lys Asn Lys 

7SS 760 76S 

Lys Glu Phe Asp His Asa Ser Asn He Arg Tyr Arg Cys Arg Gly Lys 



770 



780 



Glu Gly Tro He His Thr Val Cys He Asn Gly Arg Trp Asp Pro Glu 
785 790 795 800 

Val Asn Cys Ser Met Ala Gin He Gin Leu Cys Pro Pro Pro Pro Gin 

005 810 315 

He Pro Asn Ser His Asn Met Thr Thr Thr Leu Asn Tyr Arg Asp Gly 

820 825 830 

Glu Lys Val Ser Val Leu Cys Gin Glu Asn Tyr Leu He Gin Glu Gly 

835 640 845 

Glu Glu He Thr Cys Lys Asp Gly Arg Trp Gin Ser Ha Pro Leu Cys 

850 855 860 

Val Glu Lys lie Pro Cys Ser Gin Pro Pro Gin lie Glu His Gly Thr 
365 870 87S ' 880 

He Asn Ser Ser Arg Ser Ser Gin Glu Ser Tyr Ala His Gly Thr Lys 

885 890 " 895 

Leu Ser Tyr Thr Cys Glu Gly Gly Phe Arg He Ser Glu Glu Asn Glu 

S00 905 910 

Thr Thr Cys Tyr Met Gly Lys Trp Ser Ser Pro Pro Gin Cys Glu Gly 

915 920 ; 925 

Leu Pro Cys Lys Ser Pro Pro Glu He Ser His Gly Val Val Ala His 

930 935 940 . 

Met Ser Aso Ser Tyr Gin Tvr Gly Glu Glu Val Thr Tyr Lys Cys Phe 
345 950 955 7 " ' 960 

Glu Gly Phe Gly He Asp Gly Pro Ala He Ala Lys Cys Leu Gly Glu 
965 970 975

Lys Trp Ser His Pro Pro Ser Cys Ile Lys Thr Asp Cys Leu Ser Leu
980 985 990

Pro Ser Phe Glu Asa Ala He Pro Met Gly Glu Lys Lys Asp VaX Tyr 

995 1000 1005 

Lyra Ala Gly Glu Gin Val Thr Tyr Thr Cys Ala Thr Tyr Tyr Lys Met 

1010 101S 1020 

Asp Gly Ala Ser Asn Val Thr Cys lie Asn Ser Arg Trp Thr Gly Arg 
1025 1030 1035 1040 

Pro Thr CVS Arg Asp Thr Ser Cys Val Asn Pro Pro Thr Val Gin Asn 

1050 10ES 
Ala Tyr He Val Ser Arg Gin Met Ser Lys Tyr Pro Ser Gly Glu Arg 

1060 1065 1070 

Val Arg Tyr Gin Cys Arg Ser Pro Tyr Gin Met Phe Gly Asp Gin Glu 

1075 1080 1085 

Val Met Cys Leu Asn Gly Aan Trp Thr Glu Pro Pro Gin Cys Lys Asp 

1030 1095 1100 

Ser Thr Glv Lys Cys Gly Pro Pro Pro Pro He Asp Asn Gly Asp lie 
1105 * " 1110 HIS 1120 

Thr Ser Phe Pro Leu Ser Val Tyr Ala Pro Ala Ser Ser Val Glu Tyr
1125 1130 1135

Gln Cys Gln Asn Leu Tyr Gln Leu Glu Gly Asn Lys Arg Ile Thr Cys
1140 1145 1150

Arg Asn Gly Gln Trp Ser Glu Pro Pro Lys Cys Leu His Pro Cys Val
1155 1160 1165

Ile Ser Arg Glu Ile Met Glu Asn Tyr Asn Ile Ala Leu Arg Trp Thr
1170 1175 1180

Ala Lys Gln Lys Leu Tyr Ser Arg Thr Gly Glu Ser Val Glu Phe Val
1185 1190 1195 1200

Cys Lys Arg Gly Tyr Arg Leu Ser Ser Arg Ser His Thr Leu Arg Thr 

1205 1210 1215 

Thr Cys Trp Asp Gly Lys Leu Glu Tyr Pro Thr Cys Ala Lys Arg 
1220 1225 1230 



<210> 5 

<211> 23 

<212> DNA

<213> Homo sapiens

<400> 5

ggtttettct tgsaaatcstc agg 

<210> 6 

<211> 22 

<212> DNA

<213> Homo sapiens

<400> 6

ccattggtaa aacaaggtga ca

<210> 7 
<211> 107
<212> PRT

<213> Homo sapiens






WO 2006/096561 PCT/US2006/007725 







Met Leu Arg Leu Tyr Pro Gly Pro Met Val Thr Glu Ala Glu Gly Lys
1 5 10 15




Gly Gly Pro Glu Met Ala Ser Leu Ser Ser Ser Val Val Pro Val Ser
20 25 30




Phe Ile Ser Thr Leu Arg Glu Ser Val Leu Asp Pro Gly Val Gly Gly
35 40 45




Glu Gly Ala Ser Asp Lys Gln Arg Ser Lys Leu Ser Leu Ser His Ser
50 55 60




Met Ile Pro Ala Ala Lys Ile His Thr Glu Leu Cys Leu Pro Ala Phe
65 70 75 80


Phe Ser Pro Ala Gly Thr Gln Arg Arg Phe Gln Gln Pro Gln His His
85 90 95



Leu Thr Leu Ser Ile Ile His Thr Ala Ala Arg
100 105



<210> 8

<211> 808
<212> DNA

<213> Homo sapiens

<400> 8

gagatggcag ctggcttggc aaggggacag cacctttgtc accacattat gtccctgtac 60
cctacatgct gcgcctataa ccaggaccga tggtaactga ggcggagggg aaaggagggc 120
ctgagatggc aagtctgtcc tcctcggcgg ttcctgtgtc cttcatttcc actctgcgag 180
agtctgtgct ggaccctgga gttggtggag aaggagccag tgacaagcag aggagcaaac 240
tgtctttatc acactccatg atcccagctg ctaaaatcca cactgagctc tgcttaccag 300
ccttcttctc tcctgctgga acccagagga ggttccagca gcatcagcac cacctgacac 360
tgtctatcat ccacactgca gcaaggtgat tctgccasaa catatctcct taaaagccaa 420
ctggagcttc tcatcagcat caatgtgaag ccaaaaatcc ttaggaggac agagggagtc 480
cctcacaacc tagactggtc cccttccctc cagctgcctc aactgtccac aggactctct 540
tcccacctgc ggccacactg tgcaacctgg aatttcccca cctgggcgga ctcatcacgt 600
catcaccaat tggatgcatc ttctgctctg tgcagctggt gaaatctttc stcaaccctg 660
agatgcagcc caatcttctc ctaacatctg gattcctctc tgtcactgca ttccctcctg 720
txatcctgcc tttgttttct tgccctcctt tctctcccgg gtgataggca ttaactaaaa 780
ttaaataaaa attcagatca tccttgca 808



<210> 9 

<211> 51 

<212> DNA

<213> Homo sapiens 

<400> 9 

tttatcacac tccatgatcc cagcttectaa aatccacact gagctctgct t
<210> 10 

<211> 51

<212> DNA

<213> Homo sapiens

<400> 10

gtggaaacct cagcctgctt ctcgtycggg ttgttagagg agtcatttag a
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<212> DNA 

<213> Homo sapiens

tatt ctcacggctt tccagkgctc atttttcctg ctcatttatg



<212> DNA

<213> Homo sapiens

gttagaggag tcatttagaa agctgkacca ttctttcaat attctcacgg c



<210> 13 
<211> 51
<212> DNA

<213> Homo sapiens
<400> 13

ctcagcctgc ttctcgtccg ggttgktaga ggagtcattt agaaagctgt a

<210> 14

<211> 51

<212> DNA

<213> Homo sapiens

<400> 14

tccgggttgt tagaggagtc atttaraaag atgtaccatt ctttcaatat t

<210> 15

<211> 51

<212> DNA

<213> Homo sapiens

<400> 15

taccattctt tcaatattct cacggytttc cagtgctcat ttttcctgct c

<210> 16 
<211> 51 
<212> DNA

<213> Homo sapiens
<400> 16

gaaactgagc agcagcaggc ctgggkttgg cttttaagU fccfcatafctta a

<210> 17

<211> 50

<212> DNA

<213> Homo sapiens

<400> 17

catattacta aatctatttt tttttcagtc tatcatccac actgcagcaa




<211> 50 
<212> DNA

<213> Homo sapiens
<400> 18

ttgttttctt gccctccttt ctctcccggg tgataggcat taactaaaat

<210> 19
<211> 50
<212> DNA

<213> Homo sapiens
<400> 19

gttttcttgc cctcctttct ctcccgggtg ataggcatta actaaaatta

<210> 20

<211> 51

<212> DNA

<213> Homo sapiens 

<400> 20 

ggacgcccta tctaaaaaac aaaaamcaaa aaaa&aaaag aaaaaagaaa a

<210> 21 
<211> 51
<212> DNA

<213> Homo sapiens
<400> 21

ctcgagsgga tgccctatct aaaaamcaaa aaacaaaaaa aaaaaagaaa a

<210> 22 
<211> 51
<212> DNA

<213> Homo sapiens
<400> 22 

aaaaatagta aaacaacaac aacaamaaaa aaacaacasa aaatcccaaa a 

<210> 23
<211> 50

<212> DNA

<213> Homo sapiens
<400> 23

caacctaaaa tatcgtcatg tgtctttaaa aatgcatatt actaaatcta

<210> 24 

<211> 51

<212> DNA

<213> Homo sapiens 

<400> 24 






<210> 25 
<211> 51

<213> Homo sapiens
<400> 25

tatgtccctg taccctacat gctgcrccta tacccaggac cgatggtaac t 51

<210> 26 

<211> 51 

<212> DNA

<213> Homo sapiens 

fia.cfatgat.tt caacggatao taggg^octc tgtfcgcctcc tctggcagag c =1 

<210> 27 
<211> 51

<213> Homo sapiens
<400> 27

taattcagtt ggtctggaat agtttktttt ttccttttat tttttatttt t 51 

<210> 28 

<211> 51

<212> DNA

<213> Homo sapiens

<400> 28

gactagagat gccaagcatc ttctcrtgtg tttatttgtg ctcttagagt t 51

<210> 29

<211> 51

<212> DNA

<213> Homo sapiens

<400> 29
acttgctgca tttcaaatgc ttggcrgtca catgtagtta gtggctaccc t 51

<210> 30 

<211> 51 

<212> DNA 

<213> Homo sapiens 

tccacaggac tctcttccca cctgcrgcca cactgtgcaa cctggaattt c 51 



<212> DNA

<213> Homo sapiens 
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t L < - > t^tsicgtc atcaccaatt ggatgcatct t 51 

<210> 32

<211> 51

<212> DNA

<213> Homo sapiens

<400> 32

tttttttttt ccttttattt tttatwtttt tgagacagag tcttgctctg t 51

<210> 33 

<211> 51

<212> DNA

<213> Homo sapiens

<400> 33

aaagtgctcc tcaacctaaa atatcrtaat gtgtctttaa aaatgcatat t 51

<210> 34 

<211> 51

<212> DNA

<213> Homo sapiens

<400> 34

ggagcttctc atcagcatca atgtgmagcc aaaaatcctt aggaggaaag a 51

<210> 35

<211> 51

<212> DNA

<213> Homo sapiens

<400> 35

aagccaactg gagcttctca tcagcrtcaa tgtgaagcca aaaatcctta g 51

<210> 36
<211> 51

<212> DNA

<213> Homo sapiens
<400> 36

agcagtgcat gaggattctg atttdcccac atccttgctg atacttgtta t 51

<210> 37 

<211> 51

<212> DNA

<213> Homo sapiens

<400> 37

ggcggaccca tcacgtcatc accaaytgga tgcatcttct gctctgtgca g 51

<210> 38 
<211> 51 

<212> DNA

<213> Homo sapiens 






ccagccaatt tttgtatttt tagtaeagcc aggattccac c&tgtfcagec a 

<210> 39 
<211> 51 

<212> DNA

<213> Homo sapiens
<400> 39

tgttgatttg ctgatgacta gagatrccaa gcatcttctc atgtgtttat t

<210> 40 

<211> 51

<212> DNA

<213> Homo sapiens

<400> 40

tccaaagcag ctataccatt ctacawtccc actagcagtg catgaggatt c

<210> 41

<211> 51

<212> DNA

<213> Homo sapiens

<400> 41 

agagaaagaa tctgggcctt acaggycacg ttggtttaaa atttagacat c 

<210> 42 

<211> 51

<212> DNA

<213> Homo sapiens

<400> 42

tttaaaaatg catattacta aatctrtttt tttttcagtc tatcatccac a

<210> 43

<211> 51

<212> DNA

<213> Homo sapiens

<400> 43

ctcgatctcc tgagctcgtg atctgyccac cttggcttcc caaagtggtg g

<210> 44 

<211> 51

<212> DNA

<213> Homo sapiens

<400> 44

ttcttgccct cctttctctc ccgggkgata ggcattaact aaaattaaat a

<210> 45
<211> 51
<212> DNA 




<400> 45 

gctgccattt aggcaaaatg gtttamcatt gaatcaagga cattatgagc c 51

<210> 46
<211> 51

<213> Homo sapiens

ragag ccccaggcag ccaccraaag gtcttgaatg acagcttgtc a 51

<210> 47 

<211> 51

<212> DNA

<213> Homo sapiens

<400> 47 

ccttccccta aatcagtegc atgagrccag cagtccacct ttgcattaat t 51

<210> 48
<211> 51

<212> DNA

<213> Homo sapiens
<400> 48

atgcaactga tttaggggaa gggttygcct aaattaataa aagatctgaa t 51

<210> 49

<211> 51

<212> DNA

<213> Homo sapiens

<400> 49

tcctgtgtcc ttcatttcca ctctgygaga gtctgtgctg gaccctggag t 51



<212> DNA

<213> Homo sapiens

<400> 50

tttctctccc gggtgatagg cafctaractaa aar.fcaaafcaa aaattcagat c 

<210> 51

<211> 51

<212> DNA

<213> Homo sapiens

<400> 51

ttgccctcct ttctctcccg ggtgayaggc attaactaaa attaaataaa a

<210> 52
<211> 51 






<400> 52

ctgaggtggg aggatcacct gagccsagga gtatgaggct gcagtgagcc a

<210> 53 

<211> 51

<212> DNA

<213> Homo sapiens

<400> 53

catattacta aatctatttt ttttttcagt ctatcatcca cactgcagca a

<210> 54 

<211> 51 

<212> DNA

<213> Homo sapiens

<400> 54

ttgttttctt gccctccttt ctctctccgg gtgataggca ttaaccaaaa t



<211> 51 

<212> DNA

<213> Homo sapiens

<400> 55

gttttcttgc cctcctttct ctccctgggt gataggcatt aactaaaatt a
<210> 56

<211> 52

<212> DNA

<213> Homo sapiens

<400> 56

caacctaaaa tatcgtcatg tgtctattta aaaatgcata ttactaaatc ta
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